Prepare the latest Gromacs, and install programs according to GROMACS’s documentation. In the following explanation ERmod package is installed under (ERmod directory)
and the GROMACS installation directory is assumed to be /path/to/gromacs
. Also, GROMACS is assumed to be installed without any suffix (without _d
/ _mpi
). If you compiled GROMACS with these suffixes, replace gmx
with an appropriate program name.
Set PATH for GROMACS. User should be able to run gmx
(GROMACS 2016 or later) without errors. Otherwise, set path to GROMACS by:
source /path/to/gromacs/bin/GMXRC
And test whether gmx
runs without problem. If they do not work, see the GROMACS installation manual.
Unpack the ERmod data archive (https://sourceforge.net/projects/ermod/files/data_example/).
In the directory there are several input files useful to this tutorial.
Make a ethanol-water solution from the solute topology file:
cp etoh.top etohsolution.top
gmx solvate -cp etoh.gro -box 3.2 3.2 3.2 -cs spc216.gro -maxsol 1000 -o solution.gro -p etohsolution.top
The resulting structure, solution.gro
, contains 1 EtOH molecule (in all-atom model) and 1000 water molecules. Similarly, a pure solvent (water only) box is generated as follows:
gmx solvate -box 3.2 3.2 3.2 -cs spc216.gro -maxsol 1000 -o solvent.gro -p wat.top
To obtain the solvation free energy, user run the solution system of interest, the pure solvent system, and the isolated solute in vacuum.
The simulation of isolated solute is not necessary if the solute is treated as rigid.
For the solution and pure solvent systems, the system is first minimized with min.mdp
and is then equilibrated with nvt.mdp
and further with npt.mdp
.
Similarly, the isolated solute is minimized and equilibrated with solute_min.mdp
and solute_eq.mdp
.
After the equilibration, get the output trajectory as the xtc file.
All bond lengths may be constrained for speeding up calculation.
For detailed instruction consult the GROMACS manual.
gmx grompp -f min.mdp -c solution.gro -p etohsolution.top -o solution_min.tpr
gmx mdrun -s solution_min.tpr -o solution_min.trr -e solution_min.edr -g solution_min.log -c solution_min.gro
gmx grompp -f nvt.mdp -c solution_min.gro -p etohsolution.top -o solution_nvt.tpr
gmx mdrun -s solution_nvt.tpr -o solution_nvt.trr -x solution_nvt.xtc -e solution_nvt.edr -g solution_nvt.log -c solution_nvt.gro
gmx grompp -f npt.mdp -c solution_nvt.gro -p etohsolution.top -o solution_npt.tpr
gmx mdrun -s solution_npt.tpr -o solution_npt.trr -x solution_npt.xtc -e solution_npt.edr -g solution_npt.log -c solution_npt.gro -cpo solution_npt.cpt
# If you are using GROMACS <= 2020, or want to use anisotropic scaling, use "pcouple = Parrinello-Rahman" instead of "C-rescale" in mdp files.
# Note when you use P-R, it is advised you remove first ca. 100 ps from the trajectory, as there will be an initial box fluctuation.
# To prevent it please check "-cpo" checkpointing in mdrun and "-t" option in grompp.
gmx grompp -f solution_run.mdp -c solution_npt.gro -t solution_npt.cpt -e solution_npt.edr -p etohsolution.top -o solution_run.tpr
gmx mdrun -s solution_run.tpr -o solution_run.trr -x solution_run.xtc -e solution_run.edr -g solution_run.log -c solution_run.gro
gmx grompp -f nvt.mdp -c solvent.gro -p wat.top -o solvent_nvt.tpr
gmx mdrun -s solvent_nvt.tpr -o solvent_nvt.trr -x solvent_nvt.xtc -e solvent_nvt.edr -g solvent_nvt.log -c solvent_nvt.gro
gmx grompp -f npt.mdp -c solvent_nvt.gro -p wat.top -o solvent_npt.tpr
gmx mdrun -s solvent_npt.tpr -o solvent_npt.trr -x solvent_npt.xtc -e solvent_npt.edr -g solvent_npt.log -c solvent_npt.gro -cpo solvent_npt.cpt
gmx grompp -f solvent_run.mdp -c solvent_npt.gro -t solvent_npt.cpt -e solvent_npt.edr -p wat.top -o solvent_run.tpr
gmx mdrun -s solvent_run.tpr -o solvent_run.trr -x solvent_run.xtc -e solvent_run.edr -g solvent_run.log -c solvent_run.gro
# use "large enough" box size.
gmx editconf -f etoh.gro -c -o etoh_box.gro -box 5.0 5.0 5.0
gmx grompp -f solute_min.mdp -c etoh_box.gro -p etoh.top -o solute_min.tpr
gmx mdrun -s solute_min.tpr -o solute_min.trr -e solute_min.edr -g solute_min.log -c solute_min.gro
gmx grompp -f solute_eq.mdp -c solute_min.gro -p etoh.top -o solute_eq.tpr
gmx mdrun -s solute_eq.tpr -o solute_eq.trr -x solute_eq.xtc -e solute_eq.edr -g solute_eq.log -c solute_eq.gro
gmx grompp -f solute_run.mdp -c solute_eq.gro -e solute_eq.edr -p etoh.top -o solute_run.tpr
gmx mdrun -s solute_run.tpr -o solute_run.trr -x solute_run.xtc -e solute_run.edr -g solute_run.log -c solute_run.gro
# The molecule may be split due to the PBC.
# You will be asked about groups. Select "ETOH" and "System" respectively for centering and output (in fact any choice should result in the same).
gmx trjconv -s solute_run.tpr -f solute_run.xtc -center -pbc mol -o solute_run_pbc.xtc
Example configuration files are available on https://sourceforge.net/projects/ermod/files/data_example/.
MD production runs are performed in NPT condition, over 100 ps for the solution system to take 10,000 snapshots with 10 fs interval, and over 50 ps for the pure solvent to take 500 snapshots with 100 fs interval.
An MD of isolated solute is done over 50 ns in addition, to take 500,000 snapshots with 100 fs interval.
Theoretically rigorously speaking, the solute system we are currently running above is wrong.
The solute system corresponds to a run in non-pbc & vacuum system, but simulation of such state was now removed from recent GROMACS.
We have two good approximations:
(1) is good because it significantly save your calculation time. The approximation error originates from vdW energy difference (rvdw=1.2 vs. rvdw=2.0). As long as you are using long vdw cutoff in solution/reference system (say >= 1.2 nm), this method is probably superior. In this case vdw energy between 1.2 nm to 2.0 nm is almost negligible and will not likely affect the sampling. This tutorial uses this approximation.
(2) is time-consuming to simulate but theoretically better. This can also be used with smaller vdw cutoff in solution/solvent system. If you want to use this method, you need to disable self energy reweighting (wgtslf=0
in refs/parameters_er
and slfslt='not'
in parameters_fe
; see Parameter files for erdst and Parameter files for slvfe). Also, you will probably want to change the sampling length (50ns) to much shorter one. In this tutorial we are using sampling time of 50 ns just because the simulation is fast enough with method (1) and no reason to run long for this sized molecule. Its convergence depends on molecule types but it should be fast enough for small molecules. If it's slow-converging you will need methods for reweighting anyway.
In the following example, the trajectory files for solution system, pure solvent system, isolated solute system are named solution_run.xtc
, solvent_run.xtc
, and solute_run_pbc.xtc
, respectively. The log files are named similarly.
Input configuration file for erdst and slvfe will be generated via helper scripts.
First, execute the following command to generate system information:
(ERmod directory)/share/ermod/tools/GROMACS/gen_structure --top etohsolution.top
where (ERmod directory) is the directory where the ERmod programs are installed; the directory was specified with the option of --prefix of ./configure described in General build guide.
The option --top can be simplified to -t, and takes a topology file as an argument.
When the GROMACS runs are done in the example procedures described above, "etohsolution.top" is the argument.
Upon execution, user is asked to state which molecule is the solute.
In the example case, the ethanol molecule is the solute; type “Ethanol” (do not include quotation marks) to proceed.
After the execution, two directories are created, namely soln and refs.
Inside these directories, soln/MDinfo
, soln/MolPrm1
, soln/SltInfo
, refs/MDinfo
, refs/MolPrm1
, and refs/SltInfo
files are generated.
See Parameter files for erdst for detailed description of SltInfo, MolPrm1, and so on.
If gen_structure terminates with an error message about the molecule type, see the top of this page and try
source (path to gromacs)/bin/GMXRC
soln/MDinfo should read as
FRAME 2
1 1000
9 3
The second column of the first line indicates that there are 2 types of molecular species.
The second line states that the system contains 1 EtOH molecule and 1000 water molecules.
The last line states that 9 atoms are present within a single ethanol molecule and that 3 atoms are present within a single water molecule.
In the reference system (refs directory,) MDinfo file is also generated.
But the number of molecular species in refs/MDinfo is 1, since MDinfo in this case carries the information of the pure solvent system.
Both in soln and refs directories, MolPrm1 and SltInfo should be found.
MolPrm1 is the parameter file for solvent species, and SltInfo is the parameter file for solute.
Both files contain the Lennard-Jones parameters and charge information, which are necessary in the calculation.
See Parameter files for ermod for detailed description of MDInfo, SltInfo, MolPrmX (X = 1, 2, ...), and so on.
Now the parameters for the solution system are prepared.
Move to soln directory and run the helper script:
cd soln
(ERmod directory)/share/ermod/tools/GROMACS/gen_input --traj ../solution_run.xtc --log ../solution_run.log
The trajectory (xtc file) and log files are fed to the script
When the GROMACS runs are done in the example procedures described above, "../solution_run.xtc" and "../solution_run.log" are the arguments.
The options --traj and --log can be simplified to -x and -l, respectively.
After running the script, a symbolic link file, namely HISTORY, is to be created.
A file named parameters_er is also to be created; this file controls how the erdst program runs.
MDinfo is further to be updated; “FRAME” on the first line is replaced by the number of snapshots in trajectory.
Start the solute-solvent interaction calculation in the solution system with:
(ERmod directory)/bin/erdst
If the program runs successfully, there should be files named engsln.xx and aveuv.tt, where xx is a two-digit integer, besides several other outputs.
The former is the histogram of the solute-solvent pair interaction energy, while the latter is the averaged sum of the solute-solvent interaction energy.
The two-digit numbers indicate which fraction of the trajectory the file corresponds to.
For example, the trajectory is divided into 10 blocks.
engsln.01 is then the histogram obtained from the first one of the 10 blocks, and engsln.02 is from the second one of the 10 blocks.
Correspondingly, the output distribution functions are obtained from each of the 10 blocks.
The division into blocks is done to examine the convergence of the solvation free energy, as described in the end of Quick Start Guide ---- Running slvfe: Getting Final output
Then move on to refs directory to start calculating the reference part.
cd ../refs
(ERmod directory)/share/ermod/tools/GROMACS/gen_input --traj ../solvent_run.xtc --log ../solvent_run.log --flexible ../solute_run.xtc
(ERmod directory)/bin/erdst
The first two arguments are to be taken similarly to the case of the solution (of course, the xtc and log files are changed to those of the solvent system without solute).
The third argument, --flexible, specifies that the test-particle insertion calculation is performed with flexible structures of the solute.
When the GROMACS runs are done in the example procedures described above, "../solvent_run.xtc", "../solvent_run.log", and "../solute_run_pbc.xtc" are the arguments.
The option --flexible can be simplified to -s.
In addition to a symbolic link named HISTORY, a symbolic link file named SltConf is generated; HISTORY refers to the trajectory of the solvent, solvent_run_pbc.xtc, and SltConf is linked to the trajectory file of the isolated solute, solute_run.xtc.
After running erdst, user obtains the energy distributions for the insertion operation.
engref.xx is the histogram of the solute-solvent pair interaction energy at insertions in reference system, and corref.xx is the correlation matrix.
User may notice that there are 5 outputs of energy distribution functions.
This is because the reference trajectory is divided into 5 sections,
and the output distribution functions are obtained from each of the 5 blocks.
When two erdst runs are finished in the soln and refs directories, the solvation free energy can be calculated from the distribution functions obtained.
Go to Quick Start Guide ---- Running slvfe: Getting Final Output.
A sample output is prepared as slvfe-samle.log.
The trajectory is divided into a set of blocks both for soln and refs.
The default value of the number of divisions is 10 for the solution system and is 5 for the reference.
The number of divisions can be modified with the numdiv parameter in the parameters_er file when the erdst program is run.
See Parameter files for erdst for the description of the parameters which can be specified in parameters_er.
If the solute is rigid (its intramolecular structure is fixed), the procedure is simplified.
The scheme for the solution part is the same as above.
Note, of course, that the solution MD is supposed to be done using a rigid solute ("freezegrp = " option is useful).
Further, it is not necessary to perform a simulation of isolated solute when the solute is rigid.
The use of rigid solute simplifies the operation of the reference part.
In the refs directory, run the helper script:
cd ../refs
(ERmod directory)/share/ermod/tools/GROMACS/gen_input --traj ../solvent_run.xtc --log ../solvent_run.log --rigid ../etoh.gro
(ERmod directory)/bin/erdst
The difference from the flexible solute case is the third argument.
The third argument, --rigid, specifies that the test-particle insertion calculation is performed with fixed structure of the solute.
When the GROMACS runs are done in the example procedures described above, "../solvent_run.xtc", "../solvent_run.log", and "../etoh.gro" are the arguments.
The option --rigid can be simplified to -r.
The structure of the solute is taken from the SltInfo file when the solute is rigid.
See Parameter files for ermod for the format of the SltInfo file in the case of rigid solute.
When the refs calculation is done, the calculation of solvation free energy of rigid solute can be done with the same procedure as that in the flexible case.
Wiki: FixedLigandBind
Wiki: QuickStartGuide
Wiki: build-Guide
Wiki: parameters-erdst
Wiki: parameters-slvfe