Team:Calgary/Model/InSilicoEmulsionSystemVerification

Modelling

In Silico Emulsion System Verification

Inspiration

In order for us to know whether it would be possible to use a water soluble chlorophyll-binding protein (6GIX) in an emulsion, it was necessary for us to assess the stress placed on the hydrophilic protein by being in an emulsion, which contains hydrophobic elements. To understand the stress placed on 6GIX, we investigated the nanoscale dynamics of the protein within different systems. These observations are incredibly difficult and expensive to obtain within the laboratory, so we turned towards in silico measurements and observations instead. Through the use of molecular dynamic simulations and a supercomputer, we were able to assess the nanoscale interactions of 6GIX with different components of the emulsion.

Questions

To guide our molecular dynamic modelling, we generated four questions that would have the largest impact on our project design.

Question 1:

What are the dynamics of the protein in an aqueous environment?


Question 2:

Will the individual monomers of the 6GIX protein aggregate to form a tetramer structure?


Question 3:

How will the protein hold up in a non-aqueous environment?


Question 4:

How comfortable is 6GIX in an emulsion on the nanoscale?


Question 5:

How does the inclusion of secretion tags affect our protein folding and stability?

Methodology

How were these models generated?

All simulations were conducted in GROMACS 18.1 on the cpu2019 partition of the ARC computing cluster at the University of Calgary. A similar model generating method was used utilizing the same hardware.

Step 1. Converting Structure files into GROMACS files. The pdb files generated from structure prediction modelling or provided by the protein data bank were converted to a .gro file. Afterwards, we placed it in a theoretical cube with the dimensions set to fit around the protein with at least a nanometer of padding in all dimensions.

Step 2. Solvate the Box. Once the protein was in a theoretical box we systematically placed solvent molecules to fill the box. If two solvents are necessary, they can be completed one at a time.

Step 3. Run Energy Minimization. Once the box had been solvated we ran energy minimization to ensure a stable system for the future steps. This is done through a relaxation stage known as energy minimization to ensure no catastrophic issues with the system. This step also makes sure that solvents are realistically aligned for the next steps. This step usually takes 2 hours.

Step 4. Running Isothermal-Isochoric Equilibration. This step looks at ensuring that the solvent and our protein are stable together at the temperature we are looking to simulate. This step usually takes 4 hours.

Step 5. Running Isothermal-Isobaric Equilibration. This step is similar to the previous step, but it accounts for the pressure and density instead of temperature. This step usually takes 4 hours.

Step 6. Atom by Atom Molecular Dynamics. Once the system was equilibrated, we ran molecular dynamics. This simulates the atom by atom movement of the solvated box for a given amount of time. This step will generate a file containing the trajectories for every atom of the simulation. Due to the extreme computational requirements of this step, molecular dynamics are computed for nanoseconds at a time. This step takes 23 hours per nanosecond.

Step 7.Visualize the Trajectory. In order to visualize the dynamics of the protein we had to take the large numeric files and make them understandable to our wetlab. To make these visualizations, our team used VMD and Pymol. We recommend using Pymol, as it is easier to install and more user-friendly. Although it is harder to install and use, VMD also is very powerful for visualization and has many other applications.


Assumptions

Assumption 1. We assume that the water molecule model(spc216) is representative of what would happen in the real world.

Assumption 2. In modelling 6GIX we consistently make the assumption about its structure. Particularly we assume that the structure provided on the protein data bank sufficiently constitutes the actual nano scale structure of the protein.

Assumption 3. Other assumptions made in these models are associated with the use of Ewald Electrostatics and Verlet cutoff schemes. These determine how we model the flow of force in the system. Due to these being commonly used in industry we were comfortable with including these assumptions within the model.

Assumption 4. The largest assumption for our models was the time frame in which we simulated for. At times within the nanosecond range we truly see a very small snapshot of the total dynamics for the protein. Due to the nanoscale size of the protein however we are comfortable with the use of these time frames.

Results and Wet-lab Integration

Where and how our models were applied


The models generated were able to answer the key questions our wetlab had about the nanoscale properties of the protein.

Question 1. What are the dynamics of the protein in an aqueous environment?

The tetramer structure was maintained and the protein was stable throughout the nanosecond simulation.

Figure 1: 6GIX in water

This enabled our team to understand the stability of our protein in water and the stability of the tetramer structure. This model was also the easiest to accomplish, which allowed our dry lab to become more familiar with the software.

Question 2.Will the individual monomers of the 6GIX protein aggregate to form a tetramer structure?

This model showed that the protein monomer's long-range forces allowed for the formation a tetramer when individual monomers came close enough in contact. We also found that the protein was less likely to form a tetramer without chlorophyll binding.

Figure 2: 6GIX monomers in a 20nm cube of water. Monomers in this range will aggregate over time to form a tetramer.

This model reassured our team that it would function to bind chlorophyll in an emulsion, because it would not immediately aggregate upon protein folding, thereby inhibiting chlorophyll binding. This model was extremely complex and required over 80 hours of computing time.

Question 3.How will the protein hold up in a non-aqueous environment?

From this model, we discovered that 6GIX in oleic acid, a non-polar solvent, loses its tetramer structure, and each individual monomer denatures.

Figure 3: 6GIX denatured in oleic acid, a nonpolar solvent

This confirmed our expectations that a nonpolar solvent, such as canola oil, can be harmful to 6GIX's structure. This model emphasizes the importance of keeping 6GIX out of the oil phase of the emulsion for proper function.

Question 4. How comfortable is 6GIX in an emulsion on the nanoscale?

The bisolvent system simulation showed that the 6GIX protein was comfortable in the emulsion at the nanoscale. The protein remains folded and fully functional within the aqueous layer.

Figure 4: 6GIX in an emulsion. The protein remains in the aqueous center of the model.



Question 5. How does the inclusion of excretion tags affect our protein folding and stability?

To determine if the inclusion of a secretion tag would be acceptable for our protein we generated a dynamics simulation for each of our secretion tags. From the generated models we looked at the root mean square deviation(RMSD) for the entire simulation. This gave a way for us to understand how the tagged protein stacked up against the 6GIX wild type. The proteins with the secretion tag was observed to have lower deviation when undergoing dynamic simulation as seen below. Due to this indication of stability we were comfortable with the use of a secretion tag for our 6GIX protein.

Figure 5: Root mean square deviation (RMSD) for 6GIX versus 6GIX with the Pho secretion tag.

All together, these models helped us understand the response 6GIX has to its given environment, and helped verify the molecular-scale viability of 6GIX in an emulsion.

The impact of this modelling can be felt within multiple aspects of our project. The parts that directly benefitted from this modelling are our 6GIX protein(BBa_K3114006), our ModGIX protein(BBa_K3114007), and our 6GIX protein with several secretion tags(BBa_K3114016) (BBa_K3114017) (BBa_K3114018) (BBa_K3114019) (BBa_K31140020) (BBa_K3114021) (BBa_K3114022) (BBa_K3114023). Taken in full these models informed the generation of several proteins and the system in which they were implemented.

Future Directions

When looking to model the dynamics of our system in the future, the two largest criteria for growth are in the time simulated for, and the size of the system we are simulating. To accomplish growth in these areas we will need to gain access to larger computational resources and gain more familiarity with the software.

In addition to the increased time and size of our simulations, there is another simulation that we believe would greatly inform the project. This simulation would be to determine the ligand binding potential for 6GIX with chlorophyll. This simulation is achievable with our current resources and will be possible as we grow more confidence with GROMACS.

References

Palm, D. M., Agostini, A., Averesch, V., Girr, P., Werwie, M., Takahashi, S., . . . Paulsen, H. (2018). Chlorophyll a/b binding-specificity in water-soluble chlorophyll protein. Nature Plants,4(11), 920-929. doi:10.1038/s41477-018-0273-z

M.J. Abraham, D. van der Spoel, E. Lindahl, B. Hess, and the GROMACS development team, GROMACS User Manual version 2018, www.gromacs.org(2018)

The PyMOL Molecular Graphics System, Version 2.0 Schrödinger, LLC, 2019, www.pymol.org

Humphrey, W., Dalke, A. and Schulten, K., "VMD - Visual Molecular Dynamics", J. Molec. Graphics, 1996, vol. 14, pp. 33-38.