GROMACS mdp file parameters
A Molecular Dynamics (MD) simulation can be carried out within different conditions, for different time lengths, using different algorithms to solve the equations of motion, and so on.
To make sure that you can perform all the different kinds of simulation according to your needs GROMACS allows you to specify several parameters.
These parameters need to be inserted in a so-called Molecular Dynamics parameters file (mdp file), a simple text file in the mdp
extension.
This file is then going to be used as input in the gmx grompp module to assemble a tpr file needed to run the simulation.
mdp files
As you may know, setting up an MD simulation is not a trivial task. It requires different steps, and some of them may differ depending on your system.
Almost in every case, you will need to perform different simulations to bring your system to equilibrium before running a definitive simulation to extract relevant data.
During this process, you will need an mdp file for each simulation you want to run. Furthermore, you will need to change the parameters in every file to adjust each simulation according to your needs.
The list of possible parameters is huge. In this article, I will only discuss the most important ones and I will give you a few templates that you can use for some general protein simulations.
You can always refer to the official guide for more information.
In most cases, you will use fairly standard values for your simulations and you won’t have to change them significantly.
However, always keep in mind that the parameters may need to be adjusted depending on your system. These templates are just intended as a starting point that you can modify and adapt to your specific case.
Let’s briefly discuss the mdp files for the most common kinds of simulation.
mdp file for energy minimizations
The first mdp file we are going to consider is the one for the energy minimization.
During this procedure, we are simply interested in minimizing the potential energy of the system to avoid steric clashes.
A typical mdp file for a minimization run is the one you can find in the code box below.
|
|
The file already contains some comments (specified by the semicolumn “;”) with information about each parameter.
As you can see the file is composed of two paragraphs.
The first one is telling GROMACS some basic info about the run such as:
-
The algorithm we want to use for the minimization.
integrator=steep
means that we are using the well-known steepest descent method. You can also use more sophisticated algorithms such as the conjugate gradient (cg
) or thel-bfgs
. This will not be necessary for the majority of the systems. -
The maximum number of steps for the minimization (
nsteps=50000
). If it does not converge within this number of steps the run will end anyway. -
The convergence is given by
emtol=1000
. This means that the run will be considered as converged when the maximum force ($F_{max}$) is smaller than 1000 kJ mol$^{-1}$ nm$^{-1}$. -
The step size of the minimization is given by
emstep=0.01
(nm)
The second paragraph is dedicated to the treatment of:
- Electrostatic and Van der Waals (VDW) interactions
- Implementation of Periodic Boundary Conditions (PBC)
For this part, we usually have some standard values that are conserved in the other mdp files that you will find in this article. For instance, it is generally recommendable to use a cut-off value in the range of 1.0-1.2 nm for both Electrostatic (rcoulomb = 1.0
) and VDW (rvdw=1.0
) interactions, and to use the Particle Mesh Ewald (PME) method to treat the long-range non-bonding interactions.
mdp file for NVT equilibrations
The second mdp file we will consider is a template for an equilibration in the NVT ensemble.
This is usually required to bring the system to the desired temperature.
Find the mdp file for a general NVT equilibration in the code box below.
|
|
Let’s go through some of the most important parameters in the text file.
In the first line, we apply position restraints to the protein via the define=-DPOSRES
keyword. Position restraints are contained in the posre.itp
file that you create via the gmx pdb2gmx
module, and it allows us to equilibrate the solvent around the protein without causing significant structural changes in the protein.
Then we have a section dedicated to the Run parameters
|
|
Here we specify:
1. The type of algorithm we want to use to solve the equations of motion. The leap-frog integrator was selected in this case (integrator=md
)
2. The duration of the simulation is given by the time step we select (dt
) as well as the number of steps to perform (nsteps
). In this specific case, we selected a time step of 2 fs (dt=0.002
) and we decided to perform 50.000 steps (nsteps=50000
). The overall simulation time ($t$) is given by the time step multiplied by the number of steps.
$$ t = dt \cdot nsteps = 0.002 \cdot 50000=100000fs = 100ps = 0.1ns $$
Based on the complexity of your system the simulation time required to equilibrate may vary a lot. There is no rule of thumb to know a priori how much time your system will need to equilibrate. A little bit of experience can help. You can check if the system has equilibrated by looking at the RMSD value.
The following paragraph is telling GROMACS how frequently to save information to the output files.
|
|
- The
nstxout=500
andnstvout=500
parameters mean that we are writing the coordinates and the velocities to the trajectory file in the trr format every 500 time steps.
$$ t = dt \cdot nsteps = 0.002 \cdot 500 = 1000 fs = 1 ps $$
If you want the trajectory in a less-consuming format (xtc) you can use the nstxout-compressed
parameter instead of the ones shown in the template.
- The
nstenergy=500
parameter is needed to communicate to GROMACS how often we want to save the energies in the edr output file. You can then use this file to extract thermodynamic properties from the simulation via thegmx energy
module.
- The
nstlog=500
specifies how frequently to save information to the log file, which provides a summary of the run.
Let’s consider the next section.
|
|
The purpose of this box is to mainly determine two things:
1. We are not continuing this simulation starting from another run (continuation=no
). This is often the case for an NVT equilibration.
2. We want to constraint the bonds of the protein using the LINCS algorithm.
I am going to skip the sections related to the electrostatic and VdW interactions that were already discussed in the previous mdp file. Let’s focus on the section where we specify the conditions for the experiment.
|
|
To bring the system to the desired temperature we need an algorithm to simulate a thermostat. GROMACS provides you with different algorithms. You can decide the algorithm to use by setting the tcoupl
parameter. The V-rescale
thermostat is appropriate for a general equilibration. However, for more complex systems you may need to use less sophisticated algorithms such as the berendsen
.
You can create separate groups for the thermostat with the tc-groups
parameter. It is generally recommendable to create at least two groups, consisting of the protein (Protein
) and the rest of the system (Non-protein
). You can pass any GROMACS index
you want.
Finally, we can choose the desired temperature for both groups with the ref_t
parameter and the coupling constant with tau_t
.
pcoupl=no
). We will discuss this in more detail in the following mdp file.
In the last section of the file, we simply need to decide if we want to assign initial velocities to our system.
|
|
Since the NVT equilibration is generally the first step we need to assign initial velocities using the keyword gen_vel=yes
. The velocities will be selected from a Maxwell-Boltzmann distribution at the temperature that you choose for the gen_temp
parameter.
mdp file for NPT equilibrations
The NPT equilibration is required to bring the system to the desired pressure.
|
|
Most of the parameters in this file were already discussed in the previous parts of this article.
Here I am just going to highlight the main difference with respect to the previous file for the NVT equilibration.
|
|
As you can see in this case the pressure coupling is on, meaning that we adjust the size of the box to reach the desired pressure value. In other words, we are going to simulate a barostat.
Different barostat algorithms are available in GROMACS. In this case, we selected via the pcoupl
parameter one of the most common, the Parrinello-Rahman
. For the equilibration phase, you may also use less sophisticated barostats such as the Berendsen
.
We can select the ideal pressure value in bars with ref_p=1.0
, and the time constant with tau_p=2.0
. Furthermore, the size of the box is going to be scaled uniformly in all three dimensions due to pcoupltype=isotropic
. When simulating a protein in membrane environment it is recommendable to use a semiisotropic
scaling (isotropic scale in x,y axis and different in z axis)
The NPT simulation is usually the second equilibration step. For this reason, we are going to select continuation=yes
so that we can continue from the previous simulation in the NVT ensemble.
Moreover, we don’t need to generate new velocities as they have already been assigned in the previous step (gen_vel=no
). Velocities are read from the previous file
mdp file for production runs
Last but not least we have the mdp file for a production run.
This is generally a simulation within the NPT ensemble with no restraints on the protein.
|
|
Production runs are usually longer than the previous simulations. For this reason, it is preferable to save the coordinates in a less memory-consuming trajectory file format (xtc instead of trr).
This is specified in the “Output control” section.
|
|
The output parameters that we previously saw (nstxout
, nstvout
) are all set to zero.
We write the coordinated in the xtc file every 5000 steps (nstxout-compressed=5000
). We also need to specify that we want to save the whole system by passing the compressed-x-grps=System
parameter. You can also decide to only save the protein (compressed-x-grps=Protein
).
define=-DPOSRES
line as we don’t need position restraints on the protein anymore.