Team:Bulgaria/Design

Project Design

This year’s project is dedicated to the design of a novel synthetic platform for high-throughput identification, isolation, and characterization of antimicrobial peptides (AMPs).

The first step was the identification of putative AMPs in genomic data. For that, we first needed a collection of peptide sequences with known antimicrobial activity. We obtained them from the Database of Antimicrobial Activity and Structure of Peptides (DBAASP), a manually curated database that provides information and analytical resources to help the scientific community develop antimicrobial compounds. We filtered all database entries according to the following criteria: 10-30 amino acids in length, monomeric molecules, no unnatural amino acids, no C- or N-terminal modifications, and activity against Gram-positive or Gram-negative bacteria. We chose these criteria to maximize the probability of successful heterologous expression of the newly identified AMPs in E. coli. In the end, we had two lists (in FASTA format), each with a few hundred peptides that fulfilled these criteria.
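The sequence-level part of this filter can be sketched in a few lines of Python. This is only a minimal illustration, not the procedure we actually ran: it assumes the peptides are already available as FASTA text and checks only length and amino acid alphabet. The other criteria (monomeric, unmodified termini, target organism) depend on database annotations and are not covered here. The function names and the example records are hypothetical.

```python
from typing import Iterator, List, Tuple

# The 20 standard amino acids; any other letter is treated as "unnatural".
STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")


def read_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def passes_filter(seq: str, min_len: int = 10, max_len: int = 30) -> bool:
    """True if the peptide is 10-30 residues long and uses only standard residues."""
    seq = seq.upper()
    return min_len <= len(seq) <= max_len and set(seq) <= STANDARD_AA


def filter_peptides(fasta_text: str) -> List[Tuple[str, str]]:
    """Keep only the records that satisfy the length and alphabet criteria."""
    return [(h, s) for h, s in read_fasta(fasta_text) if passes_filter(s)]


# Illustrative input only: one acceptable peptide, one too short, one with "X".
example = (
    ">pep1\nGIGKFLHSAKKFGKAFVGEIMNS\n"
    ">pep2\nGIG\n"
    ">pep3\nGIGKFLHSAKXFGKAF\n"
)
kept = filter_peptides(example)
```

Here `kept` contains only the `pep1` record; the other two are rejected for length and for the non-standard residue, respectively.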

To identify novel AMPs, we used a strategy based on sequence similarity to already known AMPs, performing the sequence comparison with different versions of the BLAST algorithm. For this, we needed a test genome to mine for AMPs. After a brief discussion, we selected the recently published genome of Carcharodon carcharias (great white shark). We downloaded the data from NCBI and uploaded it, together with the two AMP lists, to Galaxy EU. We selected this platform because it is free, very user-friendly, and provides sufficient computational power for genome-scale analysis. All the following steps were performed using Galaxy EU.

Next, we masked the low-complexity repeat regions in the great white shark genome using DUST. The masked genome was then converted to a BLAST database with NCBI BLAST+ makeblastdb so that it could be searched efficiently. Using tBLASTn, we compared the AMPs from our lists to this database in order to identify novel putative AMPs. With an E-value cutoff of 10⁻⁵, we identified nine candidate sequences.
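The E-value cutoff can also be applied after the search, for example when re-examining exported results. The sketch below assumes that the tBLASTn hits were saved in the standard 12-column BLAST tabular format (query, subject, percent identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, E-value, bit score). That export format is an assumption, not something fixed by our workflow, and the `Hit` type, the function name, and the example rows are hypothetical.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    query: str      # AMP identifier from the query list
    subject: str    # genome scaffold or contig carrying the hit
    identity: float
    evalue: float
    bitscore: float


def parse_hits(tabular_text: str, max_evalue: float = 1e-5) -> List[Hit]:
    """Parse 12-column BLAST tabular output and keep hits with E-value <= cutoff."""
    hits = []
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # not a standard tabular record
        hit = Hit(fields[0], fields[1], float(fields[2]),
                  float(fields[10]), float(fields[11]))
        if hit.evalue <= max_evalue:
            hits.append(hit)
    return hits


# Made-up rows for illustration; they are not results from the shark genome.
example = (
    "amp_1\tscaffold_7\t85.0\t20\t3\t0\t1\t20\t1000\t1059\t2e-07\t48.1\n"
    "amp_2\tscaffold_9\t60.0\t15\t6\t0\t1\t15\t500\t544\t0.5\t22.3\n"
)
significant = parse_hits(example)
```

Only the first example row passes the 10⁻⁵ cutoff, so `significant` holds a single hit for `amp_1`.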

Next, we designed an expression module for each of them, consisting of a T7 promoter, a strong RBS, an ATG start codon, the AMP-coding sequence, a TAA stop codon, and a T7 terminator. These constructs were synthesized as gBlocks and cloned into the pSB1C3 vector. Selecting the right expression strain was difficult, since most of our peptides are expected to be toxic to Gram-negative bacteria such as E. coli. In the end, we used KRX cells, kindly provided by Promega, because they allow efficient cloning and T7-driven protein expression without the need for subcloning. Moreover, the T7 polymerase in this strain is under strict transcriptional control, which allows the expression of toxic products.
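The layout of such a module can be written down as a simple concatenation. The sketch below only illustrates the order of the elements described above. The function name is hypothetical, and the promoter, RBS, and terminator strings in the example are short dummy placeholders, not the sequences of the parts we used.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def build_expression_module(amp_cds: str, promoter: str,
                            rbs: str, terminator: str) -> str:
    """Assemble promoter + RBS + ATG + AMP coding sequence + TAA + terminator.

    amp_cds is the peptide-coding DNA without its own start or stop codon.
    """
    cds = amp_cds.upper()
    if set(cds) - set("ACGT"):
        raise ValueError("coding sequence must contain only A, C, G, T")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    if any(codon in STOP_CODONS for codon in codons):
        raise ValueError("coding sequence contains an in-frame stop codon")
    return promoter + rbs + "ATG" + cds + "TAA" + terminator


# Dummy placeholder parts (assumptions for illustration only).
module = build_expression_module(
    amp_cds="GGTATTGGTAAA",   # encodes Gly-Ile-Gly-Lys
    promoter="AAAAAA",        # placeholder for the T7 promoter
    rbs="CCCCCC",             # placeholder for the strong RBS
    terminator="GGGGGG",      # placeholder for the T7 terminator
)
```

Checking the reading frame and in-frame stop codons before ordering a synthetic fragment is a cheap way to catch design mistakes early.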

Next, we initiated experiments to test the functional activity of the isolated AMPs.

In parallel, we developed a BioBrick-compatible system that can enhance the production and purification of AMPs in E. coli. It consists of an expression cassette for AMP production that uses the DAMP4 protein as a fusion partner, which masks toxicity and allows cheap and easy purification of the peptide of interest. The module is fully BioBrick-compatible, so the RBS and the promoter can be exchanged easily. As a proof-of-principle experiment, we used pBAD from the arabinose operon and a medium-strength RBS. To allow cloning of short peptides, the cassette contains two Eco31I restriction sites that permit the insertion of annealed oligos via ligation. When we tested this cloning step, we found that many of the colonies did not contain our insert of interest. To improve usability, we developed a novel part that expresses a red chromoprotein and can serve as a selection tool (analogous to pSB1C3). This strategy significantly improved the success rate of the cloning procedure. We also applied it successfully to an older part, a gRNA expression vector.
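Designing the annealed oligos for such a site can be sketched as follows. Eco31I is a Type IIS enzyme that recognizes GGTCTC and cuts outside its recognition site, leaving 4-nucleotide 5' overhangs, so each oligo carries one overhang followed by the insert or its reverse complement. The overhang sequences in the example are hypothetical placeholders, not the overhangs of our cassette, and the function names are illustrative only.

```python
from typing import Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")
ECO31I_SITES = ("GGTCTC", "GAGACC")  # recognition site and its reverse complement


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_oligos(insert: str, left_overhang: str,
                  right_overhang: str) -> Tuple[str, str]:
    """Return (top, bottom) oligos, both written 5'->3'.

    left_overhang is the 5' protrusion of the top strand; right_overhang is
    the 5' protrusion of the bottom strand. Both must match the cut vector.
    """
    insert = insert.upper()
    left, right = left_overhang.upper(), right_overhang.upper()
    if len(left) != 4 or len(right) != 4:
        raise ValueError("Eco31I overhangs are 4 nucleotides long")
    # Top strand of the final duplex, used to check for unwanted Eco31I sites.
    duplex_top = left + insert + reverse_complement(right)
    if any(site in duplex_top for site in ECO31I_SITES):
        raise ValueError("insert would introduce an Eco31I site")
    top = left + insert
    bottom = right + reverse_complement(insert)
    return top, bottom


# Hypothetical overhangs and insert, for illustration only.
top, bottom = design_oligos("GGTATTGGTAAA", "AATG", "CTTA")
```

After annealing, the insert region forms the double-stranded core and the two 4-nucleotide tails remain single-stranded for ligation into the digested cassette.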

Protein expression driven by the cassette was monitored via SDS-PAGE. Despite its advantages, the DAMP4 expression module has one major drawback: it does not allow the production of AMPs with an intact N-terminus. To overcome this, we designed a future second version of the cassette that uses EDDIE instead of DAMP4. EDDIE undergoes efficient self-cleavage, which yields peptides with no extra amino acids at the N-terminus.