PROTEIN CAVITY SEARCH - README
License:-----------------------------------------------------------------------
Copyright (C) 2011 Victor Khangulov - Johns Hopkins University
This file is part of Protein Cavity Search.
Protein Cavity Search is free software: you can redistribute it
and/or modify it under the terms of the GNU General Public License as
published by the Free Software Foundation, either version 3 of the
License, or (at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
Description:-------------------------------------------------------------------
This program calculates the local environment of ionizable residues. The
following parameters are calculated for each ionizable atom and ligand atom:
[1] General calculated parameters:
1. Cavity locations
2. Crevice locations
3. Cavity volumes (per protein rotation)
NOTE: Cavity and crevice volumes can be calculated using the Connolly
method (option "-m VdW") or the solvent accessible surface area
(option "-m SASA").
[2] Ionizable residue calculated parameters:
1. Distance to cavity
2. Distance to crevice
3. Depth of burial
4. Packing density
5. Polar atoms within specified radius
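
As a rough illustration of item 4, packing density can be estimated on a grid
as the fraction of grid points within a given radius of an atom center that
fall inside protein atoms (vol protein/vol total, as in the env.txt column
description). The sketch below is not the program's own code: the function
name, the (x, y, z, radius) atom layout, and the sampling scheme are all
assumptions made for illustration.

```python
import itertools
import math


def packing_density(center, atoms, packing_radius, grid_size):
    """Estimate packing density around `center` on a cubic grid.

    center         -- (x, y, z) of the atom of interest, in Angstroms
    atoms          -- iterable of (x, y, z, radius) tuples (hypothetical layout)
    packing_radius -- radius of the sphere sampled around `center`
    grid_size      -- grid spacing in Angstroms
    """
    atoms = list(atoms)
    steps = int(math.floor(packing_radius / grid_size))
    offsets = [i * grid_size for i in range(-steps, steps + 1)]
    protein_points = 0
    total_points = 0
    for dx, dy, dz in itertools.product(offsets, repeat=3):
        # Keep only grid points inside the sampling sphere.
        if dx * dx + dy * dy + dz * dz > packing_radius ** 2:
            continue
        total_points += 1
        px, py, pz = center[0] + dx, center[1] + dy, center[2] + dz
        # A point counts as protein if it lies inside any atom sphere.
        for ax, ay, az, radius in atoms:
            dist_sq = (px - ax) ** 2 + (py - ay) ** 2 + (pz - az) ** 2
            if dist_sq <= radius ** 2:
                protein_points += 1
                break
    # The center point itself is always sampled, so total_points >= 1.
    return protein_points / float(total_points)
```

With no atoms the function returns 0.0, and with a single atom large enough to
cover the whole sampling sphere it returns 1.0; real structures fall in
between.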
[3] Input files:
1. PDB file with protein structure (only small proteins for now!) or a
valid PDB ID.
File - Pass a local PDB file (e.g. 1STN.pdb)
PDB ID - Get PDB coordinates from RCSB.org using a valid ID
(e.g. 1STN)
2. Parameter file (input.txt)
GRID_SIZE - Size of the grid spacing in Angstroms
MARGIN - Margin (in Ang) added around the protein when
building the grid.
PROBE - Probes used to determine space taken up by
protein atoms.
PACKING_RADIUS - Radius around the atom center within which to
calculate the packing density.
ROTATIONS - Number of times to rotate the protein in
addition to the original orientation. When this
is set to zero, the calculation is performed
once; when set to one, it is performed twice,
and so on.
DIST_TO_CAV - Distance, in Angstroms, between a residue
(ionizable atom center) and a cavity at which
the residue is considered to be "in-cavity".
DIST_TO_CREV - Distance, in Angstroms, between a residue
(ionizable atom center) and a crevice at which
the residue is considered to be "in-crevice".
WATER_RADIUS - Radius of bulk water molecule (in Ang.). This
is used to determine the location of the first
shell of waters around the protein.
NEIGHBOR_DIST - Calculate the number of polar atoms within these
distances of the ionizable atom centers. This
is a list separated by spaces.
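
The exact layout of the parameter file is not described here, so the following
is only a sketch of how such a file might be read. It assumes one
"NAME value [value ...]" entry per line, separated by whitespace, with
optional "#" comments; both the layout and the function name parse_constants
are assumptions, not the program's actual reader.

```python
def parse_constants(text):
    """Parse parameter text into a dict.

    Assumed (not documented) layout: one "NAME value [value ...]" per line,
    whitespace separated, with anything after '#' ignored.
    """
    list_keys = {"NEIGHBOR_DIST"}  # documented as a space-separated list
    int_keys = {"ROTATIONS"}       # a count, so parsed as an integer
    params = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        key, values = fields[0], fields[1:]
        if not values:
            raise ValueError("no value given for %s" % key)
        if key in list_keys or len(values) > 1:
            params[key] = [float(v) for v in values]
        elif key in int_keys:
            params[key] = int(values[0])
        else:
            params[key] = float(values[0])
    return params


def number_of_runs(params):
    """ROTATIONS counts rotations in addition to the original orientation."""
    return params["ROTATIONS"] + 1
```

For example, a file containing "ROTATIONS 2" would give number_of_runs(...)
equal to 3, matching the ROTATIONS description above.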
[4] Output files:
1. logs/cs.log - cavity volumes are printed here.
2. env.txt - environmental parameters for atoms of interest
Column Names:
TYPE - Residue type
INDEX - Residue index in the structure
CONF - Residue conformation
ATOM - Atom name
DB - Average depth of burial
DBERR - Average depth of burial error
DCAV - Average distance to cavity
DCERR - Average distance to cavity error
DR - Average distance to crevice
DRERR - Average distance to crevice error
PD - Average packing density (vol protein/vol total)
PDERR - Average packing density error (vol protein/vol total)
NT{X}p{Y} - Type of a polar residue that is within X.Y Angstroms
of the current atom
NI{X}p{Y} - ID of a polar residue that is within X.Y Angstroms
of the current atom
NC{X}p{Y} - Conformation of a polar residue that is within X.Y
Angstroms of the current atom
NA{X}p{Y} - Name of the atom, belonging to that residue, that is
within X.Y Angstroms of the current atom
3. PDB files with Cavity or crevice locations.
For each rotation, the program generates a PDB file and a dummy atom
file representing cavity or crevice locations. Viewed together, they
allow visual inspection of the cavity and crevice locations. This
output is turned on with the "-d" command line argument.
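
The neighbor columns of env.txt (NT, NI, NC, NA) encode a distance X.Y as
"{X}p{Y}". A small helper like the one below could build the expected column
names for a given NEIGHBOR_DIST entry when post-processing the output. The
helper name is hypothetical, and it assumes distances are written with exactly
one decimal place, which is not confirmed above.

```python
def neighbor_columns(distance):
    """Return the NT/NI/NC/NA column names for one neighbor distance.

    Assumes one decimal place in the name, so 3.5 becomes "3p5".
    """
    whole, frac = ("%.1f" % distance).split(".")
    suffix = "%sp%s" % (whole, frac)
    return ["%s%s" % (prefix, suffix) for prefix in ("NT", "NI", "NC", "NA")]
```

Under that assumption, neighbor_columns(3.5) gives NT3p5, NI3p5, NC3p5 and
NA3p5.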
[5] Input parameters (input.txt)
Repository path: $HeadURL: https://cavity-search.svn.sourceforge.net/svnroot/cavity-search/trunk/README $
Last committed: $Revision: 109 $
Last changed by: $Author: kilka-hamsa $
Last changed date: $Date: 2011-07-21 13:54:45 -0400 (Thu, 21 Jul 2011) $
ID: $Id: README 109 2011-07-21 17:54:45Z kilka-hamsa $
Usage:------------------------------------------------------------------------
python cavity_search.py -i <PDB-file> [options]
Options:
-h, --help show this help message and exit
-i IN_FILE, --input=IN_FILE
input PDB filename or PDB ID
-c CONST_FILE, --const=CONST_FILE
input constants filename
-d DUMMY, --dummy=DUMMY
dummy atom filename.
-m MODE, --mode=MODE SASA, VdW, or None
Examples:
python cavity_search.py -i I92A.pdb
python cavity_search.py -i 1STN
python cavity_search.py -i I92A.pdb -d
python cavity_search.py --input I92A.pdb -c const.txt
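
To process several structures in one go, the command lines above can be
assembled and launched from a short script. This is a minimal sketch that uses
only the -i, -c and -m options listed under Options; the helper names
build_command and run_all are hypothetical, and it assumes cavity_search.py is
in the current directory and "python" is on the PATH.

```python
import subprocess


def build_command(pdb_input, const_file=None, mode=None):
    """Assemble one cavity_search.py command line as an argument list."""
    command = ["python", "cavity_search.py", "-i", pdb_input]
    if const_file is not None:
        command += ["-c", const_file]
    if mode is not None:
        command += ["-m", mode]
    return command


def run_all(inputs, const_file=None, mode=None):
    """Run the search once per input; raises if any run exits with an error."""
    for pdb_input in inputs:
        subprocess.check_call(build_command(pdb_input, const_file, mode))
```

For instance, run_all(["I92A.pdb", "1STN"], mode="SASA") would run the two
structures one after the other with the SASA method.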
Release Notes:----------------------------------------------------------------
[2011.07.21] - v0.21
* Removed threads option
* Changed download of the PDB file to avoid using .gz archives
[2011.07.20]
* Implemented global scope logging
[2011.07.19]
* Simplified printing of protein and dummy atoms
* Moved high level printing functions into iohandler.py
* Removed CSUtils class and moved the classmethods to simple functions
* Moved cs_utils to csutils
* Got rid of the Rotor class and made just a list of functions
* Moved importer functions into fileio module
* Got rid of superfluous grid methods in favor of referencing attributes directly
[2011.07.18]
* Added fileio module
* Re-implemented reading of input constants and pdb atoms in a cleaner fashion
* Re-implemented printing of the env parameters in a cleaner fashion
[2011.07.08]
* Some refactoring and cleaning up going on.
* Changed xrange(len(something)) in for loops to enumerate(something).
[2011.07.07]
* (Bug: 3356863) Removed printing of neighbor carbon atoms.
* (Bug: 3356871) Print PDB ID to the log filename. - Need to rethink implementation of my logger later.
* (Bug: 3357763) Change to using xrange for efficiency.
[2011.07.06] - v0.20
* importer.py file missing from the repository - fixed
[2011.06.28]
* void.py, cs_utils.py - Cleaned up printing of void points
* importer.py - fixed removal of downloaded archive file.
* protein.py - In determining neighbor atoms, skip the atoms of the same
residue.
[2011.06.24] - v0.18
* Added a percent removed printout.
* Add importing of PDB files from RCSB.org given a PDB ID.
* Fixed removal of extra bulk points. Incorrect box size was used.
[2011.06.23]
* Remove extra protein atom points to reduce memory consumption
[2011.06.22]
* Remove extra bulk points to reduce memory consumption
[2011.06.17]
* Cleaned up some code.
[2011.05.17] - (v0.0.15 unreleased)
* Changed environment output to be in column format with "-1" values entered
in places of missing or omitted data.
[2011.03.03] - v0.0.14
* Added code to calculate packing density. It uses an additional grid that
is processed with a probe size of zero (or user specified) and calculates the
ratio of protein points to all other points. In that grid, no volumes are
calculated.
[2011.03.03] - v0.0.13
* Added code to determine the atoms in the vicinity of all ionizable atoms.
* Added code to print atoms to the environment output file in nice columns.
[2011.03.01]
* Cleaned up protein class code and implemented the atoms via inheritance
* Added code to read in the radii within which to look for polar atoms and
count them. The code to process that is not yet implemented.
[2011.03.01] - v0.0.12
[2011.02.16]
* Made nicer environment printout.
[2011.02.15] - v0.0.11
[2011.02.09] - v0.0.10
[2011.01.26]
Added protein rotation to calculate depth of burial
[2010.08.04]
Added user option to inflate the cavity/crevice grid points.
[2010.08.03] - v0.0.6
Added code to "inflate" all of the cavity and crevice grid points to spheres
of probe radius. This allows the grid points to fill all of the empty spaces
up to VdW surfaces. I think this results in a Connolly surface.
[2010.08.02] - v0.0.5
Added output of PDB file with dummy atoms at coordinates of grid
points labeled as cavities.
[2010.08.02] - v0.0.4
Minor changes to the output formatting
[5.27.2010] - v0.0.3
A release to correct SVN repository. No code changes.
[5.27.2010] - v0.0.2
1. Fixed assignment of protein labels (bug 3007081 and 3007353)
2. Fixed double processing of protein atom grid points (bug 3007348)
3. Sped up assignment of protein labels using spherical symmetry (bug 3007385)
4. Added license information (bug 3007529)
v0.0.1 - [5.25.2010] Initial Release.