Structure & Docking Model
Introduction
What do our models do?
CSMU_NCHU Taiwan conducted two modeling projects. The aim of the two dry lab projects is, first, to predict the enhanced aflatoxin-degrading performance of a novel fusion protein, and then to connect the modeling results back to the lab results in search of a reasonable explanation. The main goal is to explain how our protein works and why it performs better than the original protein.
The protein we created in this project is a fusion of MSMEG5998 (an aflatoxin-degrading protein)[1] and Thioredoxin (a folding-assisting protein that can increase solubility)[2]. To produce accurately folded MSMEG5998, we fused the enzyme to Thioredoxin, which can improve protein folding; we expect Thioredoxin to help the fusion protein itself fold accurately. With this fusion protein, our aim is a protein that degrades aflatoxin with high efficiency.
The modeling project is divided into two parts: Protein structure modeling and docking simulation.
First, we developed a 3D protein model that predicts the structure of the fusion protein and tells us whether it is misfolded. Since the active sites of MSMEG5998 toward aflatoxin (the ligand) have not been studied, we predicted the binding domain of the enzyme with aflatoxin. We then used the 3D model to simulate the binding position, which helps us improve the accuracy of the fusion protein in wet lab experiments.
Our modeling work is carried out in two parts:
1. Building the 3D model of the fusion protein.
2. Creating a docking simulation of the fusion protein, covering the active sites of Thioredoxin and the binding position of MSMEG5998 with aflatoxin.
Protein structure modeling
Overview
1. The fusion protein is a combination of two different functional proteins: MSMEG5998 and Thioredoxin. The two different proteins are combined by a linker.[3]
2. The first challenge we face is that there is no existing structure of this protein. The team still managed to predict a model by using similar proteins as templates; the software tool we used is Swiss Model.[4][5]
We only know that MSMEG5998 belongs to the FDR-A family. Since there is no exact structure of MSMEG5998, we tried to build a reliable model for the following purposes:
i. To visualize the stereoscopic structure of the two proteins.
ii. To make sure that there is no mutual bonding between the two proteins, which could cause misfolding.
First, we used NCBI to obtain the protein sequences we wanted.
Next, we inserted a linker between the two proteins.
1. Purpose: Maintain the function of both proteins by separating MSMEG5998 and Thioredoxin.
2. The sequence of the linker is:
GGTACCCGGGGATCCCTCGAGGGTGGT.
3. The linker our team added has two additional functions:
i. It contains four restriction sites: KpnI, SmaI, BamHI and XhoI.
ii. Two glycines are added at the end of the linker to increase the folding space and the stability, which lowers the chance of misfolding. [6]
4. We now have a fusion protein with the sequence listed below.
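The linker properties described above can be checked with a few lines of Python. This is a minimal sketch, not the team's actual workflow: it assumes the linker is read in frame starting at its first base, and the function names `translate` and `find_sites` are illustrative. Under that assumption the 27-base linker encodes nine residues ending in two glycines, and each of the four recognition sequences occurs once.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# Linker sequence as given in the text.
LINKER = "GGTACCCGGGGATCCCTCGAGGGTGGT"

# Recognition sequences of the four enzymes named in the text.
RESTRICTION_SITES = {
    "KpnI": "GGTACC",
    "SmaI": "CCCGGG",
    "BamHI": "GGATCC",
    "XhoI": "CTCGAG",
}


def translate(dna: str) -> str:
    """Translate DNA codon by codon, assuming reading frame 0."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def find_sites(dna: str) -> dict:
    """Return the 0-based start of each recognition site, or -1 if absent."""
    dna = dna.upper()
    return {name: dna.find(site) for name, site in RESTRICTION_SITES.items()}


if __name__ == "__main__":
    print(translate(LINKER))
    print(find_sites(LINKER))
```

Note that adjacent sites overlap by one or two bases (for example, the last two bases of the KpnI site are the first two of the SmaI site), which is how four sites fit into such a short linker.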
Visualize the fusion protein model
1. Using RaptorX[7], the structure predicted from the protein sequence can be exported as a PDB file.
2. We visualized the structure with PyMOL.
This is the 3D model of the fusion protein; the green structure is the backbone of the fusion protein. Notice that it is mainly divided into two regions, MSMEG5998 and Thioredoxin. The helices are a secondary structure called the alpha helix, and the flat arrow-like structures are beta sheets.
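Before opening a predicted model in a viewer, it can help to confirm that the PDB file contains the residues we expect. The following is a minimal sketch using only the Python standard library; it relies on the fixed-column layout of PDB `ATOM` records, and the function names and the file name `fusion_model.pdb` are hypothetical, not taken from the team's pipeline.

```python
from typing import List, Tuple


def read_ca_residues(pdb_text: str) -> List[Tuple[int, str]]:
    """Return (residue number, residue name) for every C-alpha ATOM record."""
    residues = []
    for line in pdb_text.splitlines():
        # PDB is a fixed-column format: atom name in columns 13-16,
        # residue name in columns 18-20, residue number in columns 23-26.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            residues.append((int(line[22:26]), line[17:20].strip()))
    return residues


def numbering_is_continuous(residues: List[Tuple[int, str]]) -> bool:
    """True if residue numbers increase by exactly one, i.e. no chain gaps."""
    numbers = [number for number, _ in residues]
    return all(b - a == 1 for a, b in zip(numbers, numbers[1:]))


def load_model(path: str) -> List[Tuple[int, str]]:
    """Read a PDB file from disk, e.g. load_model("fusion_model.pdb")."""
    with open(path) as handle:
        return read_ca_residues(handle.read())
```

One C-alpha atom per residue gives the length of the modeled chain, and a gap in the numbering would indicate that part of the fusion protein is missing from the model.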
Associate our results with wet lab
After the protein structure modeling, we inspected the fusion protein’s function in the wet lab project; that is, we examined whether the fusion protein degrades aflatoxin better than the original MSMEG5998. Our assumption going into the wet lab project was that, since the structure modeling results show no obvious folding error, the fusion protein should degrade aflatoxin better, because the Thioredoxin inside the fusion protein might help it fold. Please see the wet lab experiments and results here.
Docking modeling
Overview
After the team conducted the wet lab experiments on aflatoxin degradation, the results suggested that the two functional parts of the fusion protein may be accurately folded. The team therefore wanted to prove the concept by simulating the binding position of aflatoxin on the fusion protein, to check that the fusion protein can be functional, or even perform better, as expected. The team identified the possible active sites of the proteins in our project and then simulated the docking process using AutoDock and PyMOL.[8] By doing so, we expected to observe the performance of the fusion protein and, more importantly, to inspect the improvements of the new protein compared with the original ones. Please note that the fusion protein is a merger of two different proteins, MSMEG5998 and Thioredoxin. Therefore, in the later discussion, the docking simulation contains two different models: the “Thioredoxin-Fusion protein” model and the “MSMEG5998-aflatoxinB2” model.
The docking simulation of “Thioredoxin-Fusion protein”
1. Since the structure of Thioredoxin has been studied, we can pin down its active sites using UniProt. The team found two active sites, at positions 33 and 36 of the sequence.
2. Using NCBI BLAST, the team compared the sequence of the fusion protein with that of Thioredoxin. The team confirmed that the corresponding active sites in the fusion protein are also positions 33 and 36, both cysteine (C).
3. The team later constructed a 3D model of the fusion protein and labelled the active sites using PyMOL. From this model, the team could see why Thioredoxin may help protein folding: its active sites do not face away from MSMEG5998.
This 3D model shows the surface of the fusion protein, which gives an idea of what our protein looks like. The region labeled in red is the possible binding site of Thioredoxin, which may assist the folding of the fusion protein itself or of other proteins.
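The position check described in steps 1 and 2 can be expressed as a short script. This is only a sketch: the positions and residues (33 and 36, both cysteine) come from the text, but `TOY_SEQUENCE` is a made-up placeholder, not the real Thioredoxin or fusion protein sequence, and the function names are illustrative. In practice the real sequence would be pasted in its place.

```python
# Active-site positions (1-based) and expected residues, as stated in the text.
THIOREDOXIN_SITES = {33: "C", 36: "C"}

# Placeholder only: a made-up sequence with cysteines at positions 33 and 36.
# It is NOT the real Thioredoxin or fusion protein sequence.
TOY_SEQUENCE = "A" * 32 + "C" + "AA" + "C" + "A" * 20


def residues_at(sequence: str, positions) -> dict:
    """Map each 1-based position to the residue letter found there."""
    return {pos: sequence[pos - 1] for pos in positions}


def sites_match(sequence: str, expected: dict) -> bool:
    """True if every expected residue is present at its 1-based position."""
    return all(
        0 < pos <= len(sequence) and sequence[pos - 1] == residue
        for pos, residue in expected.items()
    )


if __name__ == "__main__":
    print(residues_at(TOY_SEQUENCE, THIOREDOXIN_SITES))
    print(sites_match(TOY_SEQUENCE, THIOREDOXIN_SITES))
```

The same helper applies to any list of candidate positions, provided the numbering refers to the sequence being checked.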
The structure of the fusion protein (MSMEG5998 part)
1. While the structure of MSMEG5998 remains unknown, the team still managed to predict a model by using similar proteins as templates; the software tool we used is Swiss Model[3] [4].
2. When deciding on the model of MSMEG5998, the team used Swiss Model to compare the amino acid sequence against its database of protein sequences. Two criteria, coverage and identity, lead to two different models. The team chose the protein sequence with the highest coverage as our model, named “MSMEG5998 Swiss model”.
3. The sequence of the MSMEG5998 Swiss model was compared with that of the fusion protein using UniProt. The team then identified three similar groups, labeled below, which are likely active sites.
4. The three possible loci corresponding to the fusion protein sequence are:
i. 189, Arginine (R)
ii. 214, Glutamine (Q)
iii. 246, Alanine (A)
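The template choice in step 2 of the list above, where ranking by coverage and ranking by identity can give different answers, can be illustrated as follows. This is a sketch under stated assumptions: the `Template` class, the template names and all numeric values are hypothetical and invented for illustration; they are not results from Swiss Model or from the team.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Template:
    name: str
    coverage: float  # fraction of the target sequence covered, 0 to 1
    identity: float  # percent of identical residues in the alignment


def best_by(templates: List[Template], criterion: str) -> Template:
    """Pick the template with the highest value of the given attribute."""
    return max(templates, key=lambda t: getattr(t, criterion))


# Hypothetical candidates; the numbers are invented for illustration only.
CANDIDATES = [
    Template("template_A", coverage=0.95, identity=31.0),
    Template("template_B", coverage=0.70, identity=48.0),
]

if __name__ == "__main__":
    print(best_by(CANDIDATES, "coverage").name)
    print(best_by(CANDIDATES, "identity").name)
```

Choosing by coverage, as the team did, favors a model that spans more of the target sequence, at the possible cost of lower local similarity to the template.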
Since the PDB files produced by RaptorX did not include hydrogens, the team used PMViewer v1.5.7 to add hydrogens and charge to the compound. (The following pictures show the compound before and after the enhancements.)
Further enhancements to the compound before docking simulation on MSMEG5998
The appearance of the protein in PMViewer before the enhancements.
The fusion protein after the enhancements, which add hydrogens and charge to the protein. This step makes the structure and the binding process as realistic as possible.
Adding ligand to the docking simulation of MSMEG5998-Aflatoxin B2
Search PubChem to locate the ligand, in this case aflatoxin B2, and download it in SDF format.
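The download can also be scripted through PubChem's PUG REST interface instead of the website. This is a minimal sketch, not the procedure the team followed: the helper names `sdf_url` and `download_sdf` and the output file name are hypothetical, and requesting a 3D record with `record_type=3d` is an assumption about what a docking workflow would want.

```python
from urllib.parse import quote
from urllib.request import urlopen

PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def sdf_url(compound_name: str, three_d: bool = True) -> str:
    """Build the PUG REST URL for a compound's SDF record, looked up by name."""
    url = f"{PUG_REST}/compound/name/{quote(compound_name)}/SDF"
    # Assumption: ask for a 3D conformer, which docking tools usually need.
    return url + "?record_type=3d" if three_d else url


def download_sdf(compound_name: str, path: str, three_d: bool = True) -> None:
    """Fetch the SDF record over the network and save it to `path`."""
    with urlopen(sdf_url(compound_name, three_d), timeout=30) as response:
        data = response.read()
    with open(path, "wb") as handle:
        handle.write(data)


if __name__ == "__main__":
    # Hypothetical output file name.
    download_sdf("aflatoxin B2", "aflatoxin_b2.sdf")
```

If PubChem has no 3D record for a compound, calling `sdf_url` with `three_d=False` requests the default 2D record instead.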
The docking of MSMEG5998 to Aflatoxin B2
1. Settings for aflatoxin B2 before docking: minimize the energy to obtain a stabilized compound, which is easier to take through the docking simulation.
2. Select the docking function to proceed.
Autodocking area
The possible docking area is limited to the three candidate active sites of MSMEG5998 mentioned earlier, which can increase the model’s accuracy. After docking, we visualized the result with PyMOL to create a 3D docking model. The three candidate active sites were tested and compared with one another. The team finally settled on one ideal active site: 214, Glutamine (Q).
The docking was run with AutoDock (please visit our software tools page). The cube marks the area our team chose for the docking simulation; the results are shown in the picture below.
This is a side view of the protein macromolecule. The MSMEG5998 active site 214 is presented in red, while the blue compound represents Aflatoxin.
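Restricting the search to a cube around a candidate residue amounts to choosing a grid center and a number of grid points per axis. The sketch below shows one way to derive these from the atom coordinates of the chosen residue. It is not the team's actual setting: the function name `grid_box`, the 5 Å padding and the example coordinates are assumptions, while 0.375 Å is the usual default grid spacing in AutoGrid.

```python
import math


def _round_up_to_even(n: int) -> int:
    """AutoGrid expects an even number of grid points along each axis."""
    return n if n % 2 == 0 else n + 1


def grid_box(coords, padding: float = 5.0, spacing: float = 0.375) -> dict:
    """Compute a docking box that encloses the given atoms.

    coords  : iterable of (x, y, z) atom coordinates in angstroms,
              e.g. the atoms of one candidate active-site residue
    padding : extra margin on every side (assumed value, in angstroms)
    spacing : distance between grid points in angstroms
    """
    axes = list(zip(*coords))  # three tuples: all x, all y, all z
    center = tuple(round((min(a) + max(a)) / 2, 3) for a in axes)
    npts = tuple(
        _round_up_to_even(math.ceil((max(a) - min(a) + 2 * padding) / spacing))
        for a in axes
    )
    return {"center": center, "npts": npts, "spacing": spacing}


if __name__ == "__main__":
    # Made-up coordinates standing in for the atoms of one residue.
    example_atoms = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 1.5, 0.0)]
    print(grid_box(example_atoms))
```

Running the same function on the atoms of each of the three candidate residues gives three comparable boxes, so that differences between the docking results reflect the site rather than the box size.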
Discussion and Conclusion
1. Using protein modeling techniques, the team predicted a multifunctional fusion protein in which neither part inhibits the other or causes structural failure. This later helped us proceed with the wet lab experiments.
2. With the software tools, the team was able to predict an enhanced fusion protein (MSMEG5998 combined with Thioredoxin) that performs better than the original protein (MSMEG5998).
3. In cooperation with the wet lab projects, the team was able to confirm the results of the prediction. (Click the button to visit our project’s result.)
4. Future goals:
i. Unfortunately, there is a time limit to our project. However, the team would like to continue the modeling project and put the theory into practice, testing whether active site 214 is the actual binding site for aflatoxin. The team would perform point mutations at site 214 to see whether the binding affinity changes, in order to explain why site 214 is crucial for aflatoxin degradation.
ii. After conducting the two main modeling projects, our team successfully predicted the function of our fusion protein. The long-term goal, however, is to see our aflatoxin-degrading protein put into large-scale, commercial production. Our team would therefore like to measure the productivity of our protein, in order to find the ideal production conditions and reach maximum efficiency. (Click the button to see some of the results from the experiment our team has conducted.)