AutoMap v1.1.1
==============

CONTENTS

1. Requirements
2. Additional requirements
3. Installation
4. Preparing input for AutoMap
5. Running AutoMap
6. Optimizing cumulative sum cutoffs for AutoMap
7. Publications and author contacts
8. Revision history


1. Requirements
---------------

AutoMap requires a Perl installation to run. Most modern Linux distributions will include this. AutoMap has been tested using openSUSE 10.x and 11.x, but should run on most modern Linux distributions with a Perl installation.

AutoMap has also been tested on Windows XP and 7. For Windows functionality, Cygwin is required. Perl and dos2unix must be installed in the Cygwin installation. It is also recommended to install compilers (C++ and Fortran), tcsh and the X Window System (via the xinit package); while not directly required by AutoMap, these packages will greatly enhance Cygwin functionality.

AutoMap has not been tested on Mac systems, but should work as long as Perl is available.


2. Additional requirements
--------------------------

LigPlot and the Silico toolkit are required to run AutoMap. Maestro and PyMOL are optional, but recommended for visualization.

LigPlot is no longer distributed, but has been replaced by LigPlot+, available at no cost to academic users from here:

    http://www.ebi.ac.uk/thornton-srv/software/LigPlus/

It is recommended to obtain the Het Group Dictionary from the PDB for use with LigPlot:

    http://rcsb-deposit.rutgers.edu/het_dictionary.txt

The Silico toolkit can be obtained from the following link:

    http://silico.sourceforge.net/

Maestro is available at no cost to academic users, and can be obtained from the following link:

    http://www.schrodinger.com/downloadcenter/10/

The Open Source version of the PyMOL Molecular Graphics System is available here:

    http://sourceforge.net/projects/pymol/

To process AutoDock output, the Accelrys DS Visualizer is required. A licence for Discovery Studio is not required.
DS Visualizer is available at no cost to academic users, and can be obtained from the following link:

    http://accelrys.com/products/discovery-studio/visualization-download.php

Follow the instructions issued with the respective packages to install these additional packages on your system.

AutoMap has been evaluated with LigPlot v4.x and with v5.x available with LigPlot+. Please see the notes in the Installation section for details of use with the LigPlot+ version of LigPlot. Silico v1.01 has been used with AutoMap. Maestro and PyMOL are recommended for visualization and do not affect the running of AutoMap.

Examples of the use of some of the AutoMap steps, along with input/output for these steps, are available in a separate download at our SourceForge website (examples.zip).


3. Installation
---------------

Copy the scripts to the folder containing the ligand ensemble and receptor files for analysis. The LigPlot executables and associated parameter files must also be copied to this folder. If you have obtained the Het Group Dictionary, this too needs to go in the folder with the ligand, receptor and LigPlot executables.

If using the LigPlot+ version of LigPlot, executables for a number of platforms are available within the /lib folder of the LigPlot+ package. Copy the executables relevant to your platform to the folder containing the ligand ensemble and receptor files for analysis. You will also need to copy the default LigPlot parameter file (ligplot.prm) from the /lib/params folder to this folder.


4. Preparing input for AutoMap
------------------------------

AutoMap has been evaluated using docking output from Glide, GOLD, DOCK and AutoDock. It may also be used with ensembles obtained from other molecular docking programs and molecular dynamics simulations, but has not been thoroughly evaluated in all of these applications. The use of AutoDock output requires a modified procedure from that of the other programs.

Prior to docking a multi-residue ligand (e.g. carbohydrate or peptide), it is STRONGLY RECOMMENDED that the residues are clearly specified and defined in the input ligand. For multi-residue ligands that contain several of the same residues in a row, it is highly recommended that these are specified with different residue names (e.g., trimannose might be specified as MAN-MAN-MAN, but it is better to specify it as MNA-MNB-MNC, or in some other way that clearly distinguishes the three residues). Failure to do this will result in both inaccurate mapping of van der Waals interactions and the failure of AutoMap to correctly process AutoDock results.

Do not use underscores in ligand names, as these will affect proper operation of some of the scripts.

The protein and ligand ensemble MUST be prepared as separate files.

To prepare the protein file for AutoMap, use the following procedure:

1. Export the protein coordinates as a PDB file. Note that you MUST use the protein coordinates that were used in your docking simulation; if the coordinates have changed in any way, then AutoMap may not give the expected results.

2. Open the resulting PDB file in a text editor.

3. Remove all lines beginning with anything other than "ATOM". If there are "HETATM" lines that need to be kept (e.g., due to metal ions or other co-factors), change these to "ATOM  " (that is, "ATOM" followed by two spaces, so that the column alignment is preserved) rather than removing them. It is ABSOLUTELY IMPERATIVE that ALL "TER" and "ENDMDL/END" lines are removed at this point.

4. Before saving the modified PDB file, ensure that a "TER" line is present at the end of the file. The "TER" line should also contain the atom number one greater than that of the final "ATOM" line. For example, if the final "ATOM" line is numbered 6535, then the added/altered "TER" line should read:

    TER    6536

   It is important that this "TER" line is only present at the end of the file, as this is used by AutoMap to assist in renumbering the ligand atoms for interaction analysis by LigPlot.

5. Once these changes are made, save the resulting file.

Note that residue numbering for any protein chain should not exceed 999.
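
The protein preparation steps above can be sketched in Python as follows. This is only an illustrative sketch and is not part of AutoMap: the function names prepare_protein_lines and prepare_protein_file are hypothetical, and the sketch assumes a standard fixed-column PDB file in which the atom serial number occupies columns 7-11. It keeps every "ATOM" and "HETATM" record (so unwanted "HETATM" lines, such as waters, should be deleted beforehand), rewrites "HETATM" as "ATOM  ", drops all other lines (including "TER", "END" and "ENDMDL"), and appends a single "TER" line numbered one greater than the final atom. It does not check the residue numbering or the chain identifiers.

```python
def prepare_protein_lines(lines):
    """Return PDB lines cleaned up as described in the procedure above.

    Hypothetical helper, not part of AutoMap. Assumes fixed-column PDB
    records with the atom serial number in columns 7-11.
    """
    kept = []
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith("HETATM"):
            # "ATOM  " has the same width as "HETATM", so columns stay aligned.
            line = "ATOM  " + line[6:]
        if line.startswith("ATOM"):
            kept.append(line)
        # Everything else (TER, END, ENDMDL, REMARK, ...) is dropped.
    if not kept:
        raise ValueError("no ATOM or HETATM records found")
    last_serial = int(kept[-1][6:11])
    # One TER line only, numbered one greater than the final ATOM line.
    kept.append("TER   %5d" % (last_serial + 1))
    return kept


def prepare_protein_file(src, dst):
    """Read the PDB file `src` and write the cleaned copy to `dst`."""
    with open(src) as handle:
        cleaned = prepare_protein_lines(handle)
    # Force UNIX line breaks, which matters when running under Cygwin.
    with open(dst, "w", newline="\n") as handle:
        handle.write("\n".join(cleaned) + "\n")
```

For example, prepare_protein_file("receptor_raw.pdb", "receptor.pdb") would write the cleaned file next to the original; both file names are placeholders. The result should still be inspected by hand before running AutoMap.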
The protein MUST also have chain identifiers.

To prepare the ligand ensemble for AutoMap, the following procedure, which uses Maestro and the Silico toolkit, is highly recommended. This procedure reflects that used in our lab to prepare the ligand ensemble. The first three steps of this procedure may be replaced to avoid the use of Maestro; however, Steps 4 and 5 are required. The file conversion utilities in Silico create several headings within the PDB file of the ligand ensemble which are used by AutoMap.

1. Import the ligand ensemble(s) to be analyzed into Maestro.

2. Select all of the relevant entries and export them into one uncompressed Maestro file. AutoMap can process different ligands within an ensemble; however, each ensemble of a specific ligand must be saved in its own file. Maps will be generated individually for each ligand, and combined across the entire ligand ensemble. Thus, the ligands should be suitably related to one another if the ensemble map is to be of any use (e.g. by containing a common epitope, or being of a particular structural/functional class).

3. Open a terminal in the folder where the ligands were saved, and execute the following command (requires Silico):

    read_write_mol2 <ligands>.mae

   where <ligands> is the name of the ligand file. This step will create a file called <ligands>_new.mol2

4. Once Step 3 is complete, execute the following command (requires Silico):

    read_write_pdb <ligands>_new.mol2

   This step will create a file called <ligands>_new_new.pdb

5. Rename <ligands>_new_new.pdb to <ligand>_lib.pdb, where <ligand> is the name of the ligand.

The ligand ensemble should not have any chain identifiers after conversion using the Silico tools, even if it did have them prior to conversion. It is important that the ligand has no chain identifiers for use with AutoMap.

If using GOLD output, the data in the COMPND field should be simplified before proceeding.
For example, the COMPND field for GOLD poses might typically look like this:

    1SL5|1sl5_lig|mol2|1|dock1

This string contains information about the protein and ligand that is not required for AutoMap. Furthermore, the pipe symbol will cause problems with AutoMap. In this example, this would be fixed by opening the converted pose library in a text editor, and replacing all instances of "1SL5|1sl5_lig|mol2|1|dock" with a simpler string, such as "ligand".

To prepare ligand ensembles obtained using AutoDock, use the following procedure. The procedure is considerably more complicated than that used to convert docking output from other programs; this is due to the way in which the atoms are reordered by AutoDock.

1. Run the getallad4clus.pl script in the folder containing the AutoDock DLG output. This will extract the clustered poses from the DLG file.

2. Type the following command:

    ls -ltr cluster*.pdb > ls.lst

3. Open Accelrys DS Visualizer/Discovery Studio.

4. Execute the allad4conv.pl script in DS Visualizer/Discovery Studio.

5. Import the MOL2 files generated by DS Visualizer/Discovery Studio into Maestro.

6. Select all of the relevant entries and export them into one uncompressed Maestro file (poselib.mae).

7. Open a terminal in the folder where the ligands were saved, and execute the following commands (requires Silico):

    rm -rf cluster*.mol2
    read_write_mol2 poselib.mae
    read_write_pdb poselib_new.mol2
    mv poselib_new_new.pdb poselib.pdb
    rm -rf poselib.mae
    rm -rf poselib_new.mol2

   These commands remove the MOL2 files generated by DS Visualizer/Discovery Studio, convert the ensemble saved by Maestro into PDB format, and remove the intermediary Maestro and MOL2 files containing the ensemble.

8. Create a text file (allequiv.lst) which contains equivalence tables for all of the ligands present in the ensemble. The equivalence table should look as set out below:

    >ligandname
    IDS 1,IDS 5
    SGN 2,SGN 4
    IDS 3,IDS 3
    >...
    ...
   Each line starting with a greater-than sign should contain the name of the ligand. This must be identical to the name of the ligand used in the COMPND section of the PDB file, which should be identical across the pose ensemble for that ligand. The subsequent lines indicate the residues that are "equivalent" in the structure. The first part of the line indicates what the residue should be specified as, while the second part indicates what the residue currently appears as. Thus, in the above example, all instances of IDS5 are converted to IDS1. Note that all existing instances of IDS1 are retained as IDS1; only atoms that are named according to anything in the second part of the equivalence table are changed. The third line in the example is included so that instances where IDS3 appears are retained.

   Once the table is complete, save the file. If running on Windows, this file must be in UNIX format before proceeding. To convert the file to UNIX format, execute the following command (requires dos2unix installed in Cygwin):

    dos2unix allequiv.lst

   This will ensure that the file uses UNIX line breaks, and not Windows line breaks.

9. Run the batchad4fix.pl script in the folder containing the converted poselib.pdb. This will execute the fixad4lib.pl script to generate the fixed libraries containing ensembles of poses for each ligand.

10. Run the setupmlp.pl script. This renames the files generated by the batchad4fix.pl script to the _lib format expected by mlp.pl.


5. Running AutoMap
------------------

Prior to running AutoMap, ensure that the AutoMap scripts and LigPlot installation are present in the folder where you wish to run AutoMap, along with the prepared coordinates of the ligand ensemble and protein.

The standard way to run AutoMap is to perform interaction analysis of a given protein-ligand ensemble, followed by automated site mapping based on the interaction output. To achieve this, use the following procedure:

1. Open a terminal and execute the following command:

    ./mlp.pl ./<protein>.pdb

   This will perform interaction analysis using LigPlot for each pose in the ligand ensemble and collect the results for subsequent analysis.

2. Once complete, execute the following command:

    ./sitemap.pl <hb-cutoff> <vdw-cutoff>

   where <hb-cutoff> and <vdw-cutoff> are the relevant cumulative sum cutoffs for hydrogen bonding and van der Waals interactions to be used in selecting protein residues of importance to ligand recognition.

3. The output of site mapping is stored in a file called site.map.

mlp.pl collects the ligplot.sum, ligplot.hhb and ligplot.nnb files into one file for each type of table (allsum.csv, allhhb.csv and allnnb.csv). If you wish to also collect these files separately for each complex, as well as the visual output of LigPlot, create the following folders in the directory where you run AutoMap prior to executing mlp.pl:

    pics -> to collect the visual output of LigPlot for each complex
    sum  -> to collect the ligplot.sum file individually for each complex
    hhb  -> to collect the ligplot.hhb file individually for each complex
    nnb  -> to collect the ligplot.nnb file individually for each complex

Since this can add several hundred megabytes to the output, this information is not stored by default.

sitemap.pl can generate PDB files with the B-factor column modified to represent the percentage contribution to interactions made by each residue. Residues which account for at least 5% of interactions will be visibly colored in these files. To create and render such maps, the following procedure is recommended:

1. Execute the following command:

    ./sitemap.pl <hb-cutoff> <vdw-cutoff> ./<protein>.pdb

   This will create the files <protein>_hbd.pdb and <protein>_vdw.pdb, which can be used to render the hydrogen bonding and van der Waals site maps respectively.

2. Open these newly created PDB files in PyMOL.

3. Execute the following command in the PyMOL terminal:

    cmd.spectrum("b","blue_white_red",selection="<selection-name>",minimum=0, maximum=73)

   where <selection-name> is the name of the relevant entry in PyMOL. Note that, with the exception of the <selection-name> parameter, the same command in PyMOL can be used to render both the hydrogen bonding and van der Waals site maps.

4. Render the colored proteins as surfaces.

In programs other than PyMOL, it should be possible to render the maps in the appropriate colors by selecting the color palette for B/temperature factor.

To execute sitemap.pl on mlp.pl output from multiple ligands, execute the following command:

    ./multisitemap.pl <hb-cutoff> <vdw-cutoff>

This command takes the *allsum.csv files generated by mlp.pl and executes sitemap.pl on each one, as well as on the original allsum.csv file, which contains interaction data for the entire ligand ensemble.

multisitemap.pl can also generate renderable PDB files based on site mapping output of multiple ligands:

    ./multisitemap.pl <hb-cutoff> <vdw-cutoff> <protein>


6. Optimizing cumulative sum cutoffs
------------------------------------

Prior to optimization, ensure all of the relevant files are present in the folder where you wish to run AutoMap.

1. For each validation system you wish to use, generate a pose ensemble using the cognate ligand, and convert the files into the required formats for AutoMap as described earlier.

2. Execute the following command for each validation system:

    ./mlp.pl ./<protein>.pdb

   Rename the resulting allsum.csv file to <protein>_VAL.csv BEFORE running mlp.pl for subsequent validation systems.

3. For each validation system, a list of the contacting residues in the crystallographic complex of the validation system must be supplied, and must be named <protein>.lst. No distinction between residues involved in hydrogen bonding or van der Waals contacts is made. The .lst files should contain as many entries as there are interactions.
   Each entry must be formatted as below:

    RESnnnAC

   where RES is the residue ID of the contacting residue, nnn is the residue number, A is a letter which may appear after the residue number (e.g., in Kabat numbering of antibodies, it is common to have residues numbered 100A, 100B, etc.), and C is the chain identifier. Note that the numbers MUST appear in the appropriate columns. Some examples are listed below.

    ASP  5 A   <- aspartate which is #5 in chain A
    GLN 27 L   <- glutamine which is #27 in chain L
    GLY102 H   <- glycine which is #102 in chain H
    TRP100AH   <- tryptophan which is #100A in chain H

   The simplest way to create the <protein>.lst file is to run LigPlot on the crystallographic complex, then generate the <protein>.lst file manually based on the output given in ligplot.sum.

4. Once mlp.pl has been run for each validation system, the allsum.csv files renamed appropriately, and the .lst files created, execute the following command:

    ./optcutoff.pl <resolution>

   where <resolution> is the interval (as a percentage) at which to perform mapping. If this is not specified, a default resolution of 10% is used. Note that larger resolution values may result in an inaccurate cutoff being chosen, while smaller values will result in the optimization taking significantly longer for a limited increase in accuracy. Resolution values should be selected such that 100 divided by the resolution still results in an integer (e.g., 10, 5, 2.5, etc.).

5. Once the procedure completes, the optimal cutoffs will be displayed on-screen. These cutoffs should be used in subsequent runs of sitemap.pl for protein-ligand systems related to the systems used for validation.


7. Publications and author contacts
-----------------------------------

If you use AutoMap in your publication, please cite the following publication:

    Agostino M, Mancera RL, Ramsland PA, Yuriev E. AutoMap: a tool for analyzing protein-ligand recognition using multiple ligand binding modes. J Mol Graph Model, 2013, 40:80-90.

For queries relating to AutoMap, please contact:

    Dr Mark Agostino       mark.agostino@curtin.edu.au
    Dr Elizabeth Yuriev    Elizabeth.Yuriev@monash.edu

or via our SourceForge website:

    http://ligmap.sourceforge.net/


8. Revision history
-------------------

v1.1.1  Fixed problem with fixad4lib.pl that caused it to be unable to process ligands where multiple residues were equivalent to a single residue
v1.1    Initial release
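
As a supplement to Step 3 of Section 6, the fixed-column RESnnnAC layout of the <protein>.lst entries can be sketched in Python as follows. This is a minimal sketch and not part of AutoMap: the function name format_lst_entry is hypothetical, and the sketch assumes a three-character residue ID, a residue number right-justified in three columns, one column for the optional letter after the residue number (a space when absent), and one column for the chain identifier.

```python
def format_lst_entry(res_name, res_number, chain, insertion_code=" "):
    """Format one contacting residue in the fixed-column RESnnnAC layout.

    Hypothetical helper, not part of AutoMap.
    """
    if len(res_name) != 3:
        raise ValueError("residue ID must be exactly 3 characters")
    if len(insertion_code) != 1 or len(chain) != 1:
        raise ValueError("insertion code and chain must be single characters")
    number_field = "%3d" % res_number  # right-justified in 3 columns
    if len(number_field) != 3:
        raise ValueError("residue number does not fit in 3 columns")
    return res_name.upper() + number_field + insertion_code + chain
```

With this sketch, format_lst_entry("GLY", 102, "H") gives "GLY102 H" and format_lst_entry("TRP", 100, "H", "A") gives "TRP100AH", matching the examples in Section 6.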
Source: README, updated 2013-12-02