Team:Thessaloniki/MolecularD

<!DOCTYPE html>

Molecular Dynamics

Molecular Dynamics

Overview

Our team has invested a lot of effort into developing models that simulate all parts of our project. The need to verify the experimental results regarding the calculation of the binding affinity of the DNA-Protein reaction was approached by implementing Molecular Dynamics in our model.

To assess the experimental results, we focused on calculating, in silico, the binding free energy of the reaction using the MM/GBSA and the MM/PBSA methods on trajectories created by Molecular Dynamics simulations on the DNA-Protein complex. Comparing the two and observing the controversy in the two different methods, we tried to evaluate the experimental data and provide valuable feedback regarding the credibility of the results.

Molecular Dynamics were conducted in order to collect data for the complex of NF-κB p65 Homodimer with the DNA binding site at its equilibrium state. The input data for the simulation were derived from the 1RAM structure provided by the Protein Data Bank (PDB). From a variety of MD software, we culminated in using two of the most accurate suits, AMBER and Schrodinger-Desmond. Οverall, we used the AMBER suit for both MM/GBSA and MM/PBSA calculations and the Schrodinger-Desmond software for MM/GBSA calculation. In AMBER, we had to experiment with different parameters of the MM/GBSA and MM/PBSA methods in order to get closer to the value derived from literature.

Figure 1: Binding Free Energy values derived from different methods.

We attempted to calculate the absolute binding affinity, despite the fact we suspected that we might end up with a significant error, as currently there are no robust methods concerning those calculations. After trying out both software and methods, we discarded the results calculated by the Schrodinger-Desmond MMGBSA and the MMGBSA resulting from AMBER because of the abnormally high value of the resulting energy. Finally, the MMPBSA was the closest to the experimental range of energies with the value of -55.11 kcal/mol. This value is still high compared to literature. Despite the unsatisfying results, we did gain insight into the MD simulations providing a useful methodology to process comparable data using MD. The scripts used for the simulations as well as some of the output files are uploaded on our teams github.

Binding Afinity

Binding affinity is the strength of the binding interaction between a single molecule to the ligand/binding partner. This strength is absolutely proportional to the binding free energy ΔG. ΔG of binding is the calculated difference between the energy of the reactants and the products. The lower the binding energy, the stronger the bond between receptor and ligand.

Figure 2: The gibbs energy of a system during a reaction. The reactants create an intermediate complex with high energy and then, while the final products are forming, the energy drops.

In order to calculate ΔG, one way to go is to measure the concentrations of the protein, the DNA and the Protein-DNA complex at the equilibrium state. Ιn more detail, using those concentration data we can calculate the equilibrium constant and therefore the Binding Free Energy of the reaction from the equation 1.

(Eq. 1)

Molecular Dynamics Theory

The process of calculating the Gibbs Energy through a Molecular Dynamic simulation is quite different from the previous approach.

As we describe in more detail below, one reaction trajectory is being produced with a duration of 100 ns. Then the Binding Free Energy for some frames of the trajectory is calculated. The same calculations are done for the complex, the receptor and the ligand, and the frames used are all in an equilibrium state.
The total Gibbs Binding Free Energy is calculated according to the following equation:

(Eq. 2)

We minimized the idle time of our experiments by using a single trajectory approach, which means that all ensembles are extracted from a single molecular dynamic trajectory. That being said, we are making the assumption that the receptor and ligand ensembles are comparable in the bound and the unbound state.
At this point of our experiments we came up on the obstacle of not knowing which method to use to calculate the Binding Free Energy. Molecular Dynamics simulations were not used from professors at our home University. Therefore, we contacted Professor Kolocouris A. at National and Kapodistrian University of Athens. After a skype session with Mr Kolocouris, we were introduced to one of his M.Sc students, Dimitris Stamatis. Dimitris provided assistance on many levels of our work, among others he helped us chose the best method to calculate the Binding Free Energy for our case.
Among the multiple binding free energy calculation methods, we chose the methods, MM/GBSA and MM/PBSA, as they are the most typical when calculating free energies.

According to MM-GB/PB-SA methods, in order to calculate the binding free energy, it is being broken down to its constituents. Every aspect of the equation (2) is calculated separately according to the following equations:

For each term on the right side of (Eq. 2):

(Eq. 3)

(Eq. 4)

(Eq. 5)

The table below shows the method used for each component:

Table I: Methods used to calculate the respective energy values.
Energy Method
Polar part of solvation free energy ΔGpol Generalized Born (GB) model or Poisson–Boltzmann (PB) equation
Non polar part of solvation free energy ΔGnp Solvent accessible surface area (SASA)
Internal Energy, ΔEinternal
Van der Walls bonds, ΔEvdW
Electrostatic Interactions, ΔEele
Molecular-mechanics (MM) classical Hamiltonian mechanics
Entropy of vibration ΔS quasi-harmonic model
Entropy of rotation ΔS statistical mechanical formulas

Poisson-Boltzmann & Generalized Born

The Poisson-Boltzmann equation (PB) describes the theoretical electrostatic interactions of a solute in a solvent which contains ions. Usually this equation is used with approximations as it is computationally expensive.

An approximation to the linearized Poisson-Boltzmann equation is the Generalized Born model. Here, the solute is considered to be composed of spheres whose internal dielectric constant differs from the external solvent.

Input Preparation

We derived our input file (1RAM) from the Protein Data Bank. The input structure of the DNA - NF-KB p65 HOMODIMER, did not contain the hydrogens nor the ions for the neutralization. Therefore, prior to conducting the simulation we had to add them. Then we applied the appropriate force fields. A force field describes the forces that are applied on each atom. Finally ions and water molecules were added to the solvent.

It is important to note that at the AMBER suit we used the leaprc.protein.ff14SB and the leaprc.DNA.OL1 force fields, while in Desmond, we used the OPLS_2005 force field.

Figure 2: Before input preparation the structure consists of the protein and the DNA without hydrogens ions or water (left). After adding all the components, the system is a box that contains the complex and the ions as solutes and water as solvent (right).

Minimization

The structure obtained after the preparation, contains a geometry which may not correspond to the actual minima for the forces we are using, because of the addition of extra molecules. This will result in conflicts and overlaps between the atoms of the system. So, prior to any other simulations we are minimizing the structure. Minimization rearranges the positions of the atoms in order to satisfy the implemented force fields. The minimization time was fairly short in comparison with the total simulation.

Temperature and Pressure

Τill this point our whole system has a temperature of 0 K. The lab experiment was conducted at 310 K, so we have to heat up the system at the desired temperature, with a method that will equally distribute the heat. It is required that our system will be heated from the initial temperature at a slow rate to avoid disrupting the solvation box limits.

Figure 3: The temperature of the system in Kelvin for the first 120 ps of simulation. The graph was obtained from the simulation with AMBER.

Initially, our solvated system is at zero pressure. However, the experiment is being conducted under constant, atmospheric, pressure. Heating the system under constant pressure results in changes in the volume of the solvated box. To avoid this, during the heating we keep the volume constant, while the pressure can change. Once the desired temperature is reached, the restraints are reversed. The desired outcome is to preserve a constant average pressure, and not a constant pressure in each frame.

In figure (5) below we observe that for the first 20ps of the simulation, the pressure remains close to zero. This is not irregular as the simulation occured for constant volume. After 20ps, we observe the pressure rate fluctuating wildly. As the simulation time passes by, the pressure stabilizes to 1 atm. We observe the stabilization of the mean value at an estimated time of 60ps.

Figure 4: The pressure value is calculated in atm (left) and the volume value in A3 (right) of the system between 20-120 ps of simulation. The graphs are obtained from simulation analysis with AMBER suite.

Equilibrium

After successfully preparing our system, we proceed with the generation of the trajectory files. To complete an accurate Molecular Dynamics simulation it is crucial to apply a sufficient time period for the simulation to occur till the equilibrium state. Reaching the equilibrium state is equivalent to reaching the minimum energy value. We conducted multiple simulations in order to evaluate the time our system needs to equilibrate. We came to the conclusion that 100 ns were more than enough for our simulation. It is important to state that the data needed to calculate the binding free energy are derived from a time period that our system is at the equilibrium state. The size of the output files from this process is extremely big. For example the size of a 100 ns trajectory is nearly 1 TB. Thus the trajectory is not uploaded with the code.

We are aware that our system has reached equilibrium when the average distance between the heavy atoms of the system (C,N,O) is stabilizing or fluctuates around an average value. This evaluation measure Root-mean-square deviation of atomic positions (RMSD) was used to reach a conclusion. RMSD is reliable, according to the bibliography, when one wants to know that a system has reached equilibrium.

Figure 5 : RMSD of all atoms of the complex (up-left), RMSD of DNA atoms (up-right) and the RMSD of the protein backbone (down middle). The protein Backbone contains the A carbons of the protein.The graphs are obtained from the simulation conducted with Schrodinger suite.

MM-GB/PB-SA

By using MM/GBSA and MM/PBSA we tried to calculate the absolute free binding energy. As described in theory above, we used MM/GBSA and MM/PBSA methods to analyze the frames of the simulation that has reached equilibrium. Using the mean values of the results we conducted the required calculations as shown in Eq.2.

These methods are mainly used for relative predictions, by comparing a known complex with a mutated one, so we had to try with different parameters in order to have acceptable accuracy. The parameters that we were mainly changing were, the use of an entropy model, the Generalized Born model used and the internal and external dielectric constants. Even so, the possibility of ending up with an energy of another order of magnitude was still high.

The calculations were done to frames that were in equilibrium. Each of these frames was 40 ps away from its neighbors frames. For the period between 250 and 375 ns, 626 frames were analyzed. The output files demonstrating the results are uploaded in the teams github. However some intermediate files are not uploaded due to their size.

Software

We used two software packages for the Molecular Dynamics simulations and the analysis of the results. On our first approach to Molecular Dynamics we used Schrodinger suite. Schrodinger provides a user friendly environment and can be accessed on a free non commercial license. The second software we used is AMBER, a suite of biomolecular simulation programs. At the beginning of our experiments we used AmberTools19, the free module that AMBER community provides.

In detail, we used Schrodinger free version, distributed from DE Shaw to prepare the system and conduct the Molecular Dynamics simulations using a GPU. After the simulation procedure was completed we converted all files derived from Schrodinger to be compatible with AmberTools19 file format. We quickly abandoned the above methodology as it was too difficult, time consuming and most importantly unable to provide us with trustworthy results.

We reached out to AMBER community and Schrodinger for assistance. We were delighted to hear that both companies could provide us with a free full licence of the programs for the competition.
Meanwhile, we came in conduct with Professor Kolocouris Antonios, Associate Professor at the department of Pharmaceutical Chemistry and Organic Chemistry at the National and Kapodistrian University of Athens. Dr Kolocouris referred us to his M.Sc. student, Stamatis Dimitris, member of the Pharmaceutical chemistry laboratory at the National and Kapodistrian University of Athens. Dimitris from that moment and after was our dry lab mentor and provided assistance at every problem we occured.

Having received the full licenses for both Schrodinger and AMBER we redesigned our simulations. We first approached Schrodinger software as it provides a user friendly graphic environment and minimizes the time needed to prepare the simulations. Nonetheless, the results obtained from the RMSD of the ligand (Figure 6) depict high abnormal values. Additionally, the calculated Binding Free Energy from the MM/GBSA method is also surprisingly high as the results fluctuate around the value of 200 kcal/mol. We came to the conclusion that Schrodinger software could not support efficiently our simulations. The results derived were not compatible with the expected values. After conducting the support team of Schrodinger we were informed that primal modules of Schrodinger lack support for DNA-Protein binding. Consequently, we conducted our simulations using AMBER software. The calculations derived from AMBER are presented in detail on the result section.

The code used for the calculations can be found in our GitHub account.

Schrodinger - Desmond

As mentioned above, the results for the Binding Free Energy derived from Schrodinger were abnormally high and we discarded them. More specifically, from the frames at equilibrium state of a 100 ns simulation, the calculated free binding energy with a MMGBSA method was -283.27 kcal/mol. This results to abnormal values of equilibrium constants, K_bind = 3.734*10-210 and K_dissacotiation = 2.678 * 10209. The K_dissacotiation constant according to the bibliography fluctuates around 150.7 ± 29.2 nM. The error appearing in the difference between the values indicates that either the software or the method (MMGBSA) is inappropriate for these calculations.

Since bibliographic data are provided, it is wise to calculate the limits of the desired binding free energy. Thus we are using equation 1 using the two extreme values of the bibliographic data, 121.5 * 10-9 M and 179.9 * 10-9 M.

Table II: K_d values and ΔG binding values according to bibliographic research.
K_d (M) ΔG binding (kcal/mol)
121.5 * 10-9 -9.8134
179.9 * 10-9 -9.0060

Using AMBER we performed a molecular dynamics simulation of 100 ns. From the resulting trajectory we used some of the frames that were at equilibrium to perform the MMGBSA and the MMPBSA methods. These frames were selected from the RMSD diagram shown below.

Figure 7: The RMSD of the heavy atoms through the time.

The RMSD Figure shows that the most stable period was between 45 and 80 ns. Thus we used frames from this period for our calculations. In particular we used the trajectory frames that correspond to the period between 50 and 75 ns . At 84.5 ns the value of the RMSD rises from ~2 A to ~20 A. This “jump” is due to the artifacts introduced by the periodic boundary conditions and are common in molecular dynamics. We just erased the trajectory frames after 84 ns.
The next step was to calculate the binding free energy through the two methods. The calculations were conducted to for the frame corresponding to 50 ns and every 40 ps ever after, until the frame corresponding to 75ns. The results for the different parameters for each method are demonstrated in the following table.

Table III: Results for the different parameters used for each method.
No Generalized - Born indi Dielectric constant of the solute region. exdi Dielectric constant of the solvent region MMGBSA (kcal/mol) MMPBSA (kcal/mol)
1 Model by Onufriev, D. Bashford and D.A. Case 1 80 -168.8995 -55.1109
2 Variant of the above model 1 80 -215.7852 -55.1109
3 Model by Mongan, Simmerling, McCammon, Case and Onufriev 1 80
4 Model by Onufriev, D. Bashford and D.A. Case 4 78 -168.8995 -103.8995
5 Model by Onufriev, D. Bashford and D.A. Case 1 78 -168.8995 -57.5328

From the calculations No 1,2,3 we conclude that the best method for the Generalized -Born method is the Onufriev, D. Bashford and D.A. Case Model. As for the Poisson-Boltzmann method, comparing the results for different values of the Dielectric constant of the solute region and the solvent region we conclude that the values giving the best results are 1 and 80 in respectively.

The best value of the calculated binding free energy is -55.1109 kcal/mol which is far from the experimental value of 9 kcal/mol. As a result we do use this value for comparison with our experimental results.

References

[1]Schrodinger software

[2]AMBER & AmberTools suite

[3]Desmond software

[4]

[5]Adrian E. Roitberg,Holger Gohlke,Nadine Homeyer,Jason M. Swails,T. Dwight McGee Jr. ,Bill R. Miller III (2012) MMPBSA.py: An Efficient Program for End-State Free Energy Calculations.

[6]Christopher B. Phelps, Lei Lei Sengchanthalangsy, Shiva Malek and Gourisankar Ghosh (2000) Mechanism of κB DNA binding by Rel/NF-κB dimers.