Read install_readme.txt and run the installcan script.
endcaps is another helpful program. After you have run CAN lib, it adds terminal names to your residues using your built library files.
1. Place this folder in any directory you want. It is a standalone installer for Linux and Mac.
2. Install CAN on your system by running our install script. The installer is intended for Unix-based systems only. It may work on Mac systems but has not been tested there.
3. Please read the license agreement. By using this software or any of its code, you agree to it.
4. To use the program, simply run the main program, CAN, located in the main folder.
5. Add CAN to your PATH to make it executable from anywhere. For example:
export PATH=$PATH:$HOME/bin/can_1.0
6. Add this line to your .bashrc so the change persists across sessions.
You can also find more information on how to add programs to your PATH by searching online.
The following is basic information on how the program works and how to edit it for your own needs.
License: Version 3 of the GNU General Public License
***************************************************************************************************************************************************************************
Section 1 is for the main program CAN
************************************************************************************************************************************************************************
This README is for the latest version of can.py, currently CAN 1.0.
CAN.py (and later versions) is a script that uses a previously created database to name, charge, and atomtype a mol2 file based solely on the connectivity of atoms in a residue.
Optional Flags:
-h,  --help         Shows this message
-c,  --charge       Charge the molecule
-a,  --atomtype     Atomtype the molecule
-n,  --name         Name the molecule
-O,  --overwrite    Overwrite the previous mol2 file
-i,  --input        Input file name, only takes one file name (Default: all mol2 files in a directory)
-o,  --output       Output file name, only takes one file name
-ct, --charge_type  Charge type (Default: ff14sb)
NOTE: The default is to charge, atomtype, and name all mol2 files in a directory.
Therefore, you only need to specify -c, -a, or -n when you want to perform just one or two of these operations.
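For anyone who wants to see how these flags fit together, here is a minimal sketch of an argparse setup that mirrors the flag list above. This is not the code in can.py. The function names build_parser and selected_actions are hypothetical, and the "no action flag means do everything" rule is taken from the NOTE above.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented flags (not can.py's own code)."""
    parser = argparse.ArgumentParser(prog="can.py")  # -h/--help is added automatically
    parser.add_argument("-c", "--charge", action="store_true", help="Charge the molecule")
    parser.add_argument("-a", "--atomtype", action="store_true", help="Atomtype the molecule")
    parser.add_argument("-n", "--name", action="store_true", help="Name the molecule")
    parser.add_argument("-O", "--overwrite", action="store_true",
                        help="Overwrite the previous mol2 file")
    parser.add_argument("-i", "--input", default=None,
                        help="Input file name (default: all mol2 files in a directory)")
    parser.add_argument("-o", "--output", default=None, help="Output file name")
    parser.add_argument("-ct", "--charge_type", default="ff14sb", help="Charge type")
    return parser


def selected_actions(args):
    """Return the set of operations to perform; no action flag means all three."""
    chosen = {"charge": args.charge, "atomtype": args.atomtype, "name": args.name}
    if not any(chosen.values()):
        return set(chosen)
    return {action for action, wanted in chosen.items() if wanted}
```

With this sketch, parsing an empty argument list selects all three operations with charge type ff14sb, while parsing ["-c", "-ct", "gaff"] selects only charging.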
To run, you need a few things.
First, you should have created a database from which to pull the charges, names, and atomtypes. This can be done with the script canlib.py. This library should be located in the directory:
/home/user/bin/charge_library/
CAN.py will search in this directory for the database. If you would like to change the directory, simply update the path variable on line 256.
Second, you should either mark can.py as executable and make sure its directory is on your PATH, or place it in the same directory as the mol2 files you would like to name, charge, and atomtype.
The mol2 files you would like to charge need to have a specific format. The format in which Avogadro saves mol2 files is almost correct, but you need to make one small change: Avogadro appends the residue number to the end of each residue name. For example, it would name the first residue in a peptide ALA1 instead of ALA. Just delete this number from all of the residue names. This can easily be done in Sublime Text using shift + right click to select a column of text.
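If you would rather script the residue-number cleanup than do it by hand, the following is a minimal sketch. It assumes the standard Tripos mol2 layout, in which the residue (substructure) name is the 8th whitespace-separated field of each line in the @<TRIPOS>ATOM section. The function name strip_residue_numbers is hypothetical and is not part of CAN. Note that it rejoins fields with single spaces, so column alignment is not preserved, and it only touches ATOM records.

```python
import re


def strip_residue_numbers(mol2_text):
    """Remove trailing digits from the residue-name field of each ATOM record."""
    cleaned = []
    in_atoms = False
    for line in mol2_text.splitlines():
        if line.startswith("@<TRIPOS>"):
            # Track whether we are inside the ATOM section
            in_atoms = line.strip() == "@<TRIPOS>ATOM"
            cleaned.append(line)
            continue
        fields = line.split()
        if in_atoms and len(fields) >= 8:
            # Field 8 (index 7) is the residue name, e.g. ALA1 -> ALA.
            # Keep the original value if stripping would leave it empty.
            fields[7] = re.sub(r"\d+$", "", fields[7]) or fields[7]
            line = " ".join(fields)
        cleaned.append(line)
    return "\n".join(cleaned)
```

Run it on a copy of your file first and compare the result before overwriting anything.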
To run CAN.py, simply run it as you would any other script, providing arguments to tell the script exactly what you want to do. As long as you type enough of the folder name of the charge library to uniquely identify it, CAN.py will find it (provided it is in the directory specified above). There are a couple of exceptions coded in. For example, if you want gaff, you would type 'gaff'. However, there is a possibility that the program would select gaff2 instead. I have coded a solution to this particular instance of the problem, but be aware that similar collisions are possible depending on the libraries you are using, and you may have to code your own solutions.
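To illustrate the gaff/gaff2 problem, here is one possible way to resolve a partial library name. This is a sketch and not the logic in CAN.py: it assumes matching is done on the start of the folder name, and it prefers an exact match so that 'gaff' never resolves to gaff2. The function name find_library is hypothetical.

```python
def find_library(charge_type, library_names):
    """Resolve a (possibly partial) charge type to exactly one library folder name."""
    # An exact match always wins, so 'gaff' is never confused with 'gaff2'
    if charge_type in library_names:
        return charge_type
    # Otherwise accept a partial name only if it identifies a single folder
    matches = [name for name in library_names if name.startswith(charge_type)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError("no charge library matches %r" % charge_type)
    raise ValueError("%r is ambiguous, it matches: %s" % (charge_type, ", ".join(sorted(matches))))
```

With folders named ff14sb, gaff, and gaff2, typing 'ff' resolves to ff14sb, 'gaff' resolves to gaff, and 'ga' raises an error because it does not uniquely identify a library.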
PLANNED UPDATES
-Find a better naming scheme for files (any suggestions would be helpful.)
SOME NOTES
-You will need a different library file for the end cap amino acids. These have different connectivity, so a different file is needed. In addition, the first and last residues in the mol2 file will need a different name corresponding to the name of the end cap residue library files. My suggestion is to simply add a C or N to the residue name depending on if it is on the C or N terminus. For example, an ALA on the N-terminus would become NALA.
-Unless you want the default charge type, you need to specify a charge type even if you are not updating the charge. The charge type is what is used to find the correct folder to update the name, atom type, and charge. If you do not want to update the charge, simply specify the charge type that corresponds to the desired name or atom type.
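The end cap naming suggestion in the first note can be sketched as follows. This is only an illustration of the N/C prefix convention, not the endcaps program itself. The function name add_terminal_prefixes is hypothetical, and leaving a single-residue list unchanged is my own assumption, since the note does not cover that case.

```python
def add_terminal_prefixes(residue_names):
    """Return a copy with 'N' prefixed to the first residue and 'C' to the last."""
    renamed = list(residue_names)
    if len(renamed) < 2:
        # Assumption: a lone residue is ambiguous (both termini), so leave it alone
        return renamed
    renamed[0] = "N" + renamed[0]    # N-terminus, e.g. ALA -> NALA
    renamed[-1] = "C" + renamed[-1]  # C-terminus, e.g. SER -> CSER
    return renamed
```

For a peptide with residues ALA, GLY, SER, this gives NALA, GLY, CSER.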
ADDING ELEMENTS TO FUNCTIONALITY
can.py does not have support for all elements, just the most common ones in organic chemistry (if I'm being honest, just the ones we needed). Currently, there is support only for C, Cl, F, H, N, O, and S. The good news is that it is easy to add support for any other elements you need.
1. Locate the function called findConnectivity. It should be around line 85. Then, find the lines that initialize the variables counting each element. These lines are of the form:
numC = 0
numF = 0
numH = 0
2. Add another line of the element type you need. For instance, if you wanted to add Al, you would add the line:
numAl = 0
I would suggest keeping the elements in alphabetical order to keep things neat, but you can do what you want.
3. Find the lines with a bunch of if/elif/else statements, starting around line 104. They should have the form:
    if element == 'c':
        numC += 1
    elif element == 'h':
        numH += 1
    ...
4. Add another block to this statement. Make sure that it is before the else statement. If you wanted to add Al, you would add the lines:
    elif element == 'al':
        numAl += 1
5. Repeat steps 3 and 4 for the second set of if/elif/else blocks, starting around line 130.
6. Next, find where the lists total_atoms and atoms are initialized, around lines 154 and 155. Add the variable that counts whatever element you are adding in alphabetical order to the total_atoms list. Add the element symbol in alphabetical order to the atoms list. Again using Al as an example, the new lists would look as follows:
total_atoms = [numAl, numC, numF, numH, numN, numO, numS]
atoms = ['Al', 'C', 'F', 'H', 'N', 'O', 'S']
7. Save the changes.
That's it!
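To show what the pattern looks like once the steps above are applied, here is a small self-contained sketch with Al added. It is not the body of findConnectivity: the function name count_elements is hypothetical, only three elements are shown to keep it short, and collecting unrecognized elements in a list is my own stand-in for whatever the real else branch does.

```python
def count_elements(elements):
    """Count atoms per element using the if/elif pattern, with Al added."""
    # Steps 1 and 2: one counter per supported element
    numAl = 0
    numC = 0
    numH = 0
    unknown = []
    for element in elements:
        element = element.lower()
        # Steps 3 and 4: one branch per element, all before the else
        if element == 'al':
            numAl += 1
        elif element == 'c':
            numC += 1
        elif element == 'h':
            numH += 1
        else:
            unknown.append(element)
    # Step 6: both lists in the same (alphabetical) order
    total_atoms = [numAl, numC, numH]
    atoms = ['Al', 'C', 'H']
    return dict(zip(atoms, total_atoms)), unknown
```

The important detail is that total_atoms and atoms stay in the same order, so each count lines up with its element symbol.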
**************************************************************************************************************************************************************************
Section 2 is for CAN lib
***************************************************************************************************************************************************************************
canlib.py is a script to create a database for charging molecules. Our database has the advantage of having totally unique and unambiguous names for each atom in a residue. Our script makes it possible to bypass LeAP when charging molecules, making use of nonnatural amino acids more efficient. Furthermore, our script allows charging of molecules regardless of atom names. The charge is assigned based on connectivity and atom type only. canlib.py creates the database for this charging method.
To run, all you need to do is have a library of mol2 files. Then, do one of two things:
1. Put canlib.py in the same directory as your mol2 files and run it via the command:
python canlib.py
2. Put canlib.py in your bin, mark it as executable (chmod +x canlib.py), make sure your bin is on your PATH, go to the directory with the mol2 files, and run it via the command:
canlib.py
canlib.py will create a folder with your new library at this destination:
/home/user/bin/charge_library
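After building a library, it can be handy to check what is actually in that directory. The sketch below lists the library folders found under a given root. It assumes each charge type is stored as its own subfolder of charge_library, which is how the folder-name matching in Section 1 reads to me. The function name list_libraries is hypothetical and is not part of canlib.py.

```python
from pathlib import Path


def list_libraries(library_root):
    """Return the sorted names of the subfolders under library_root."""
    root = Path(library_root)
    if not root.is_dir():
        # Nothing has been built yet, or the path is wrong
        return []
    # Only directories count as libraries; loose files are ignored
    return sorted(entry.name for entry in root.iterdir() if entry.is_dir())
```

For example, list_libraries("/home/user/bin/charge_library") returns an empty list if canlib.py has not been run yet.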
ADDING ELEMENTS TO FUNCTIONALITY
canlib.py does not have support for all elements, just the most common ones in organic chemistry. Currently, there is just support for C, Cl, F, H, N, O, and S. The good news is that it is super easy to add support for any elements you wish.
1. Locate the function called findConnectivity. It should be around line 43. Then, find the lines that initialize the variables counting each element. These lines are of the form:
numC = 0
numF = 0
numH = 0
2. Add another line of the element type you need. For instance, if you wanted to add Al, you would add the line:
numAl = 0
I would suggest keeping the elements in alphabetical order to keep things neat, but you can do what you want.
3. Find the lines with a bunch of if/elif/else statements, starting around line 60. They should have the form:
    if element == 'c':
        numC += 1
    elif element == 'h':
        numH += 1
    ...
4. Add another block to this statement. Make sure that it is before the else statement. If you wanted to add Al, you would add the lines:
    elif element == 'al':
        numAl += 1
5. Repeat steps 3 and 4 for the second set of if/elif/else blocks.
6. Next, find where the lists total_atoms and atoms are initialized, around lines 105 and 106. Add the variable that counts whatever element you are adding in alphabetical order to the total_atoms list. Add the element symbol in alphabetical order to the atoms list. Again using Al as an example, the new lists would look as follows:
total_atoms = [numAl, numC, numF, numH, numN, numO, numS]
atoms = ['Al', 'C', 'F', 'H', 'N', 'O', 'S']
7. Save the changes.
That's it!
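If editing two if/elif chains for every new element gets tedious, one possible alternative is a table-driven count, where adding an element means adding one symbol to a list. To be clear, this is not what canlib.py currently does. It is a sketch of a different design, and the names SUPPORTED and count_supported are hypothetical.

```python
from collections import Counter

# Adding an element only requires adding its symbol here
SUPPORTED = ['Al', 'C', 'Cl', 'F', 'H', 'N', 'O', 'S']


def count_supported(elements, supported=SUPPORTED):
    """Count supported elements case-insensitively; unsupported ones are skipped."""
    lookup = {symbol.lower(): symbol for symbol in supported}
    counts = Counter()
    for element in elements:
        symbol = lookup.get(element.lower())
        if symbol is not None:
            counts[symbol] += 1
    # Build the two parallel lists in alphabetical order, as in step 6
    atoms = sorted(supported)
    total_atoms = [counts[symbol] for symbol in atoms]
    return atoms, total_atoms
```

The trade-off is that the counts live in one structure instead of separate numX variables, so any code that reads those variables directly would need to change as well.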