2. Additional requirements
4. Preparing input for AutoMap
5. Running AutoMap
6. Optimizing cumulative sum cutoffs for AutoMap
7. Publications and author contacts
8. Revision history
AutoMap requires a Perl installation to run. Most modern Linux distributions
will include this. AutoMap has been tested using openSUSE 10.x and 11.x, but
should run on most modern Linux distributions with a Perl installation. AutoMap
has also been tested on Windows XP and 7. For Windows functionality, Cygwin is
required. Perl and dos2unix are required to be installed in the Cygwin
installation. It is also recommended to install compilers (C++ and Fortran),
tcsh and the X Window System (via the xinit package); while not directly
required by AutoMap, these packages will greatly enhance Cygwin functionality.
AutoMap has not been tested on Mac systems, but should work as long as Perl is
installed.
2. Additional requirements
LigPlot and the Silico toolkit are required to run AutoMap. Maestro and PyMOL
are optional requirements, but recommended for visualization.
LigPlot is no longer distributed, but has been replaced by LigPlot+, available
at no cost to academic users from here:
It is recommended to obtain the Het Group Dictionary from the PDB for use with
The Silico toolkit can be obtained from the following link:
Maestro is available at no cost to academic users, and can be obtained from
the following link:
The Open Source version of the PyMOL Molecular Graphics System is available
To process AutoDock output, the Accelrys DS Visualizer is required. A licence
for Discovery Studio is not required. DS Visualizer is available at no cost to
academic users, and can be obtained from the following link:
Follow the instructions supplied with each of these packages to install them
on your system.
AutoMap has been evaluated with LigPlot v4.x and with v5.x available with
LigPlot+. Please see the notes in the Installation section for details of use
with the LigPlot+ version of LigPlot. Silico v1.01 has been used with AutoMap.
Maestro and PyMOL are recommended for visualization and do not affect the
running of AutoMap.
Examples of the use of some of the AutoMap steps, along with input/output for
these steps, are available in a separate download at our SourceForge website
Copy the scripts to the folder containing the ligand ensemble and receptor
files for analysis. The LigPlot executables and associated parameter files
must also be copied to this folder. If you have obtained the Het Group
Dictionary, this too needs to go in the folder with the ligand, receptor and
If using the LigPlot+ version of LigPlot, executables for a number of platforms
are available within the /lib folder of the LigPlot+ package. Copy the
executables relevant to your platform to the folder containing the ligand
ensemble and receptor files for analysis. You will also need to copy the
default LigPlot parameter file (ligplot.prm) from the /lib/params folder to this
4. Preparing input for AutoMap
AutoMap has been evaluated using docking output from Glide, GOLD, DOCK and
AutoDock. It may also be used with ensembles obtained from other molecular
docking programs and molecular dynamics simulations, but has not been
thoroughly evaluated in all of these applications. The use of AutoDock output
requires a modified procedure from that of the other programs.
Prior to docking a multi-residue ligand (e.g. carbohydrate or peptide), it is
STRONGLY RECOMMENDED that the residues are clearly specified and defined in the
input ligand. For multi-residue ligands that contain several of the same
residues in a row, it is highly recommended that these are specified with
different residue names (e.g., trimannose might be specified as MAN-MAN-MAN, but
it is better to specify it as MNA-MNB-MNC, or in some other way that clearly
distinguishes the three residues). Failure to do this will result in both
inaccurate mapping of van der Waals interactions and the failure of AutoMap to
correctly process AutoDock results.
Do not use underscores in ligand names as these will affect proper operation of
some of the scripts.
The protein and ligand ensemble MUST be prepared as separate files.
To prepare the protein file for AutoMap, use the following procedure:
1. Export the protein coordinates as a PDB file. Note that you MUST use the
protein coordinates that were used in your docking simulation; if the
coordinates have changed in any way, AutoMap may not produce the expected
results.
2. Open the resulting PDB file in a text editor.
3. Remove all lines beginning with anything other than "ATOM". If there are
"HETATM" lines (e.g., due to metal ions or other co-factors), change these
to "ATOM  " (padded with two trailing spaces so that the column layout is
preserved). It is ABSOLUTELY IMPERATIVE that ALL "TER" and "ENDMDL/END"
lines are removed at this point.
4. Before saving the modified PDB file, ensure that a "TER" line is present at
the end of the file. The "TER" line should also contain the atom number one
greater than the final "ATOM" line. For example, if the final "ATOM" line
is numbered 6535, then the added/altered "TER" line should read:
It is important that this "TER" line is only present at the end of the file
as this is used by AutoMap to assist in renumbering the ligand atoms for
interaction analysis by LigPlot.
5. Once these changes are made, save the resulting file.
Note that residue numbering for any protein chain should not exceed 999.
Furthermore, the protein MUST have chain identifiers.
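The protein preparation steps above can also be scripted. The following Python
sketch is one possible way to do so and is not part of AutoMap. The function
name prepare_protein_pdb is hypothetical, and the layout of the added "TER"
line (record name in columns 1-6, atom number right-justified in columns 7-11)
is an assumption based on the standard PDB column format. Compare the output
with a manually prepared file before relying on it.

```python
def prepare_protein_pdb(pdb_text):
    """Return a cleaned copy of a protein PDB file (sketch, not AutoMap code)."""
    kept = []
    for line in pdb_text.splitlines():
        if line.startswith("HETATM"):
            # "ATOM  " is six characters wide, so all columns stay aligned.
            line = "ATOM  " + line[6:]
        if line.startswith("ATOM"):
            kept.append(line)
        # Every other line (TER, END, ENDMDL, REMARK, ...) is dropped.
    if not kept:
        raise ValueError("no ATOM or HETATM records found")
    # The atom number occupies columns 7-11 of an ATOM record.
    last_serial = int(kept[-1][6:11])
    # Assumed TER layout: "TER" padded to six columns, then the atom number.
    kept.append("TER   {:>5d}".format(last_serial + 1))
    return "\n".join(kept) + "\n"
```

The function takes the text of the exported PDB file and returns the cleaned
text, which still needs to be saved under the file name used for the protein.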
To prepare the ligand ensemble for AutoMap, the following procedure, which
uses Maestro and the Silico toolkit, is highly recommended. This procedure
reflects that used in our lab to prepare the ligand ensemble. The first three
steps of this procedure may be replaced to avoid the use of Maestro; however,
Steps 4 and 5 are required. The file conversion utilities in Silico create
several headings within the PDB file of the ligand ensemble which are used by
1. Import the ligand ensemble(s) to be analyzed into Maestro.
2. Select all of the relevant entries and export them into one uncompressed
Maestro file. AutoMap can process different ligands within an
ensemble; however, each ensemble of a specific ligand must be saved in its own
file. Maps will be generated individually for each ligand, and combined
across the entire ligand ensemble. Thus, the ligands should be suitably
related to one another if the ensemble map is to be of any use (e.g. by
containing a common epitope, or being of a particular structural/functional
3. Open a terminal in the folder where the ligands were saved, and execute the
following command (requires Silico):
where <ligands> is the name of the ligand file. This step will create a
file called <ligands>_new.mol2
4. Once Step 3 is complete, execute the following command (requires Silico):
This step will create a file called <ligands>_new_new.pdb
5. Rename <ligands>_new_new.pdb to <ligand>_lib.pdb, where <ligand> is the
name of the ligand.
The ligand ensemble should not have any chain identifiers after conversion
using the Silico tools, even if it did have them prior to conversion. It is
important that the ligand has no chain identifiers for use with AutoMap.
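A quick check for leftover chain identifiers can be scripted. The Python sketch
below is not part of AutoMap; the function name lines_with_chain_ids is
hypothetical, and it assumes the standard PDB layout in which the chain
identifier occupies column 22 of each ATOM/HETATM record.

```python
def lines_with_chain_ids(pdb_text):
    """Return the 1-based line numbers of atom records that carry a chain ID."""
    flagged = []
    for number, line in enumerate(pdb_text.splitlines(), start=1):
        if not line.startswith(("ATOM", "HETATM")):
            continue
        # Column 22 (index 21) holds the chain identifier in the PDB format.
        if len(line) > 21 and line[21] != " ":
            flagged.append(number)
    return flagged
```

An empty list means that no atom record in the ligand ensemble carries a chain
identifier.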
If using GOLD output, the data in the COMPND field should be simplified before
proceeding. For example, typically, the COMPND field might look like this for
This string contains information about the protein and ligand that is not
required for AutoMap. Furthermore, the pipe symbol will cause problems with
AutoMap. In this example, this would be fixed by opening the converted pose
library in a text editor, and replacing all instances of
"1SL5|1sl5_lig|mol2|1|dock" with a simpler string, such as "ligand".
To prepare ligand ensembles obtained using AutoDock, use the following
procedure. The procedure is quite complicated compared to the procedure to
convert docking output from other programs; this is due to the way in which
the atoms are reordered by AutoDock.
1. Run the getallad4clus.pl script in the folder containing the AutoDock DLG
output. This will extract the clustered poses from the DLG file.
2. Type the following command:
ls -ltr cluster*.pdb > ls.lst
3. Open Accelrys DS Visualizer/Discovery Studio.
4. Execute the allad4conv.pl script in DS Visualizer/Discovery Studio.
5. Import the MOL2 files generated by DS Visualizer/Discovery Studio into
Maestro.
6. Select all of the relevant entries and export them into one uncompressed
Maestro file (poselib.mae).
7. Open a terminal in the folder where the ligands were saved, and execute the
following commands (requires Silico):
rm -rf cluster*.mol2
mv poselib_new_new.pdb poselib.pdb
rm -rf poselib.mae
rm -rf poselib_new.mol2
These commands remove the MOL2 files generated by DS Visualizer/Discovery
Studio, convert the ensemble saved by Maestro into PDB format, and remove
the intermediary Maestro and MOL2 files containing the ensemble.
8. Create a text file (allequiv.lst) which contains equivalence tables for all
of the ligands present in the ensemble. The equivalence table should look
as set out below:
IDS 1,IDS 5
SGN 2,SGN 4
IDS 3,IDS 3
Each line starting with a greater-than sign should contain the name of the
ligand. This must be identical to the name of the ligand used in the COMPND
section of the PDB file, which should be identical across the pose ensemble
for that ligand. The subsequent lines indicate the residues that are
equivalent in the structure. The first part of the line indicates
what the residue should be specified as, while the second part indicates what
the residue currently appears as. Thus, in the above example, all instances
of IDS5 are converted to IDS1. Note that all existing instances of IDS1 are
retained as IDS1; only atoms that are named according to anything in the
second part of the equivalence table are changed. The third line in the
example is included so that instances where IDS3 appears are retained.
Once the table is complete, save the file. If running on Windows, this file
must be in UNIX format before proceeding. To convert the file to UNIX format,
execute the following command (requires dos2unix installed in Cygwin):
This will ensure that the file uses UNIX line breaks, and not Windows line
breaks.
9. Run the batchad4fix.pl script in the folder containing the converted
poselib.pdb. This will execute the fixad4lib.pl script to generate the fixed
libraries containing ensembles of poses for each ligand.
10. Run the setupmlp.pl script. This renames the files generated by the
batchad4fix.pl script to the _lib format expected by mlp.pl.
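To illustrate how the equivalence tables described in Step 8 are read, the
following Python sketch parses a table into a lookup of "current residue" to
"desired residue". It is only an illustration and is not the code used by
fixad4lib.pl. The function name parse_equivalences and the ligand name
"heparin" are hypothetical, and the sketch assumes that the ligand name
directly follows the greater-than sign on its own line.

```python
def parse_equivalences(text):
    """Map ligand name -> {current residue: desired residue} (sketch only)."""
    tables = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A ">" line starts the table for a new ligand.
            current = tables.setdefault(line[1:].strip(), {})
            continue
        if current is None:
            raise ValueError("residue pair found before any '>' ligand line")
        # First part: what the residue should become; second part: what it is.
        target, source = (tuple(part.split()) for part in line.split(","))
        current[source] = target
    return tables


example = """>heparin
IDS 1,IDS 5
SGN 2,SGN 4
IDS 3,IDS 3
"""
table = parse_equivalences(example)["heparin"]
print(table[("IDS", "5")])  # ('IDS', '1')
```

Because the lookup is keyed on the current residue, several current residues
can map to the same desired residue, and any residue that is not listed is
left unchanged.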
5. Running AutoMap
Prior to running AutoMap, ensure that the AutoMap scripts and LigPlot
installation are present in the folder where you wish to run AutoMap, along
with the prepared coordinates of the ligand ensemble and protein.
The standard way to run AutoMap is to perform interaction analysis of a given
protein-ligand ensemble, followed by automated site mapping based on the
interaction output. To achieve this, use the following procedure:
1. Open a terminal and execute the following command:
This will perform interaction analysis using LigPlot for each pose in the
ligand ensemble and collect the results for subsequent analysis.
2. Once complete, execute the following command:
./sitemap.pl <hb-cutoff> <vdw-cutoff>
where <hb-cutoff> and <vdw-cutoff> are the relevant cumulative sum cutoffs
for hydrogen bonding and van der Waals interactions to be used in selecting
protein residues of importance to ligand recognition.
3. The output of site mapping is stored in a file called site.map.
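This README does not spell out how sitemap.pl applies the cumulative sum
cutoffs. The Python sketch below shows one common reading of a cumulative sum
cutoff and should be treated as an assumption, not as AutoMap's actual
implementation: residues are ranked by their share of interactions and selected
until the running total reaches the cutoff (in percent). The function name and
the residue labels are hypothetical.

```python
def select_by_cumulative_sum(contributions, cutoff):
    """Pick top-ranked residues until their summed share reaches the cutoff.

    contributions: dict mapping a residue label to its interaction count.
    cutoff: cumulative percentage (0-100) at which selection stops.
    """
    total = sum(contributions.values())
    if total <= 0:
        return []
    selected = []
    running = 0.0
    ranked = sorted(contributions.items(), key=lambda item: item[1], reverse=True)
    for residue, count in ranked:
        if running >= cutoff:
            break
        selected.append(residue)
        running += 100.0 * count / total
    return selected


counts = {"ASP5A": 50, "GLN27L": 30, "GLY102H": 15, "TRP100AH": 5}
print(select_by_cumulative_sum(counts, 80))  # ['ASP5A', 'GLN27L']
```

Under this reading, a higher cutoff selects more residues, which is why the
hydrogen bonding and van der Waals cutoffs are worth optimizing separately.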
mlp.pl collects the ligplot.sum, ligplot.hhb and ligplot.nnb files into one
file for each type of table (allsum.csv, allhhb.csv and allnnb.csv). If you
wish to also collect these files separately for each complex, as well as the
visual output of LigPlot, create the following folders in the directory where
you run AutoMap prior to executing mlp.pl:
pics -> to collect the visual output of LigPlot for each complex
sum -> to collect the ligplot.sum file individually for each complex
hhb -> to collect the ligplot.hhb file individually for each complex
nnb -> to collect the ligplot.nnb file individually for each complex
Since this can add several hundred megabytes to the output, this information
is not stored by default.
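The four optional folders can be created by hand or with a short script. The
Python sketch below is not part of AutoMap (the function name
create_output_folders is hypothetical); it simply creates the folders named
above in the directory where AutoMap will be run.

```python
from pathlib import Path

# Folder names taken from the list above.
OPTIONAL_FOLDERS = ("pics", "sum", "hhb", "nnb")


def create_output_folders(run_dir="."):
    """Create the optional per-complex output folders inside run_dir."""
    created = []
    for name in OPTIONAL_FOLDERS:
        folder = Path(run_dir) / name
        # exist_ok=True makes it safe to call this more than once.
        folder.mkdir(exist_ok=True)
        created.append(folder)
    return created
```

Call create_output_folders() from the folder containing the AutoMap scripts
before executing mlp.pl.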
sitemap.pl can generate PDB files with the B-factor column modified to
represent the percentage contribution to interactions made by each residue.
Residues that account for at least 5% of interactions will be visibly colored
in these files. To create and render such maps, the following procedure is
1. Execute the following command:
./sitemap.pl <hb-cutoff> <vdw-cutoff> ./<protein>.pdb
This will create the files <protein>_hbd.pdb and <protein>_vdw.pdb, which
can be used to render the hydrogen bonding and van der Waals site maps
2. Open these newly created PDB files in PyMOL.
3. Execute the following command in the PyMOL terminal:
where <selection-name> is the name of the relevant entry in PyMOL. Note
that, with the exception of the <selection-name> parameter, the same
command in PyMOL can be used to render both the hydrogen bonding and van
der Waals site maps.
4. Render the colored proteins as surfaces.
In programs other than PyMOL, it should be possible to render the maps in the
appropriate colors by selecting the color palette for B/temperature factor.
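For readers unfamiliar with the B-factor column, the following Python sketch
shows how a per-residue percentage can be written into it. This is an
illustration of the idea only, not the code used by sitemap.pl. The function
name set_bfactors is hypothetical, and the column positions assume the
standard PDB format (chain in column 22, residue number in columns 23-26,
insertion code in column 27, B-factor in columns 61-66).

```python
def set_bfactors(pdb_text, percentages):
    """Write a per-residue percentage into the B-factor column (sketch only).

    percentages: dict keyed by (chain, residue number, insertion code);
    residues that are not listed receive a value of 0.00.
    """
    out = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            key = (line[21], int(line[22:26]), line[26].strip())
            value = percentages.get(key, 0.0)
            # Pad short lines so that the value always lands in columns 61-66.
            line = "{:<60}{:6.2f}{}".format(line[:60], value, line[66:])
        out.append(line)
    return "\n".join(out) + "\n"
```

Any viewer that colors by B/temperature factor will then shade each residue
according to the value stored for it.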
To run sitemap.pl on mlp.pl output from multiple ligands, execute the
following command:
./multisitemap.pl <hb-cutoff> <vdw-cutoff>
This command takes the *allsum.csv files generated by mlp.pl and executes
sitemap.pl on each one, as well as the original allsum.csv file, which contains
interaction data for the entire ligand ensemble.
multisitemap.pl can also generate renderable PDB files based on site mapping
output of multiple ligands:
./multisitemap.pl <hb-cutoff> <vdw-cutoff> <protein>
6. Optimizing cumulative sum cutoffs for AutoMap
Prior to optimization, ensure all of the relevant files are present in the
folder where you wish to run AutoMap.
1. For each validation system you wish to use, generate a pose ensemble using
the cognate ligand, and convert the files into the required formats for
AutoMap as described earlier.
2. Execute the following command for each validation system:
Rename the resulting allsum.csv file to <protein>_VAL.csv BEFORE running
mlp.pl for subsequent validation systems.
3. For each validation system, a list of the contacting residues in the
crystallographic complex of the validation system must be supplied, and
must be named <protein>.lst. No distinction between residues involved in
hydrogen bonding or van der Waals contacts is made. The .lst files should
contain as many entries as there are interactions. Each entry must be
formatted as below:
where RES is the residue ID of the contacting residue, nnn is the residue
number, A is a letter which may appear after the residue number (e.g., in
Kabat numbering of antibodies, it is common to have residues numbered 100A,
100B, etc.), and C is the chain identifier. Note that the numbers MUST
appear in the appropriate columns. Some examples are listed below.
ASP  5 A   <- aspartate which is #5 in chain A
GLN 27 L   <- glutamine which is #27 in chain L
GLY102 H   <- glycine which is #102 in chain H
TRP100AH   <- tryptophan which is #100A in chain H
The simplest way to create the <protein>.lst file is to run LigPlot on
the crystallographic complex, then generate the <protein>.lst file manually
based on the output given in ligplot.sum.
4. Once mlp.pl has been run for each validation system, the allsum.csv
files renamed appropriately, and the .lst files created, execute the
where <resolution> is the interval at which to perform mapping. If this is not
specified, a default resolution of 10% is used. Note that larger resolution
values may result in an inaccurate cutoff being chosen, while smaller values
will result in the optimization taking significantly longer for a limited
increase in accuracy. Resolution values should be selected such that 100
divided by the resolution still results in an integer (e.g., 10, 5, 2.5,
5. Once the procedure completes, the optimal cutoffs will be displayed
on-screen. These cutoffs should be used in subsequent runs of sitemap.pl
for protein-ligand systems related to the systems used for validation.
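Because the entries in the <protein>.lst files described in Step 3 are
column-sensitive, it may help to generate them with a script. The Python
sketch below is not part of AutoMap; the function name format_lst_entry is
hypothetical, and the column layout (three-letter residue ID, residue number
right-justified in three columns, one column for the optional letter, one
column for the chain) is an assumption inferred from the examples GLN 27 L,
GLY102 H and TRP100AH.

```python
def format_lst_entry(res_name, res_number, chain, insertion=""):
    """Build one fixed-width .lst entry (assumed layout, see text above)."""
    if len(res_name) != 3:
        raise ValueError("residue ID must be three characters")
    if not 0 < res_number <= 999:
        raise ValueError("residue number must be between 1 and 999")
    if len(chain) != 1:
        raise ValueError("chain identifier must be a single character")
    if len(insertion) > 1:
        raise ValueError("only one letter may follow the residue number")
    # A blank keeps the chain identifier in its column when no letter is given.
    return "{}{:>3d}{}{}".format(res_name, res_number, insertion or " ", chain)


print(format_lst_entry("GLN", 27, "L"))        # GLN 27 L
print(format_lst_entry("TRP", 100, "H", "A"))  # TRP100AH
```

Each entry is eight characters wide, so a single-digit residue number is
preceded by two spaces.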
7. Publications and author contacts
If you use AutoMap in your work, please cite the following publication:
Agostino M, Mancera RL, Ramsland PA, Yuriev E. AutoMap: a tool for analyzing
protein-ligand recognition using multiple ligand binding modes. J Mol Graph
Model, 2013, 40:80-90.
For queries relating to AutoMap, please contact:
Dr Mark Agostino firstname.lastname@example.org
Dr Elizabeth Yuriev Elizabeth.Yuriev@monash.edu
or via our SourceForge website:
8. Revision history
v1.1.1 Fixed problem with fixad4lib.pl that caused it to be unable to process
ligands where multiple residues were equivalent to a single residue
v1.1 Initial release