smina / Discussion / Help and Feedback: No residue information on minimized PDBQT

Miro - 2016-03-28

Hello,

I am using smina to minimize ligand-protein complexes, and I have noticed that in the PDBQT output file containing the binding site, smina replaces the residue names with UNL.

This causes two problems:

1. It is difficult to merge the minimized residues back into the original PDB file.

2. Some scoring functions expect to find residue names.

I have tried to convert the PDBQT files back to PDB using OpenBabel, but it treats the residues as heteroatoms belonging to a single unknown residue.
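
A quick way to confirm the symptom is to list the distinct residue names in a file. The following is a minimal sketch, assuming standard fixed-width PDB/PDBQT records with the residue name in columns 18-20; the function name `list_resnames` and the sample records are made up for illustration and are not taken from the attached files.

```shell
#!/usr/bin/env bash
# list_resnames FILE: print each distinct residue name found in the
# ATOM/HETATM records of a PDB or PDBQT file (fixed columns 18-20).
list_resnames() {
    grep -E '^(ATOM|HETATM)' "$1" | cut -c18-20 | sort -u
}

# Illustrative, truncated records: serial, atom name, residue name, chain, number.
demo=$(mktemp)
printf 'ATOM  %5d %-4s %3s %1s%4d\n' 1 ' CA ' UNL ' ' 1 >  "$demo"
printf 'ATOM  %5d %-4s %3s %1s%4d\n' 2 ' CB ' UNL ' ' 1 >> "$demo"

resnames=$(list_resnames "$demo")
echo "$resnames"
rm -f "$demo"
```

If the only name printed for a flexible-residue output file is UNL, the residue names were lost somewhere in the writing step.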

Would it be possible to modify the smina code in order to preserve residue information?

Best regards,

Miro

- David Koes - 2016-03-28
  
  Can you provide a test case?
  
  David Koes
  Assistant Professor
  Computational and Systems Biology
  University of Pittsburgh
  
  - Miro - 2016-03-29
    
    This would be the command line:
    
    smina --receptor proteins/${protein}.pdbqt --ligand docking/${ligand}/${result}.pdbqt --flexdist_ligand docking/${ligand}/${result}.pdbqt --out docking/${ligand}/${result}_min.pdbqt --out_flex docking/${ligand}/${result}_protmin.pdbqt --log docking/${ligand}/${result}_min.log --flexdist 6 --minimize
    
    Attached you will find the input and output files (output files contain the string "min").
    
    As you can see, in both output files (ligand and protein flexible residues) all residue names have been changed to "UNL".
    
    I do not know whether things would be different if I provided a protein pdbqt file already split into flexible and rigid residues. However, letting smina handle this automatically is a lot more convenient when dealing with large numbers of proteins and ligands, as is my case.
    
    Thank you and kind regards,
    
    Miro Moman
    RCSI Molecular Medicine
    Dublin, Ireland
    
    smina_min.tar.gz
    
    - David Koes - 2016-03-29
      
      The issue here is that OpenBabel's PDBQT writing code eliminates residue
      names (because the flex part only includes side chains, not whole
      residues). We use this code in two places - once at the beginning in
      defining the flexible residues and once at the end to write out a pdbqt
      file, if that is what is requested.
      
      I've committed a workaround for the first case, so we don't lose residue
      information right off the bat. I've also updated smina.static. If you
      output as a pdbqt you will still get UNL residues, but pdb (no qt)
      output will retain the residue names.
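
      In practice this means giving `--out_flex` a `.pdb` file name. The sketch below is only an illustration under that assumption: the helper `flex_pdb_name` and the values of `protein`, `ligand` and `result` are hypothetical, the smina options are the ones used in the command line quoted in this thread, and the command is printed rather than executed.

```shell
#!/usr/bin/env bash
# flex_pdb_name PATH: map a *.pdbqt (or *.pdb) path to the matching *.pdb path,
# so that the flexible-residue output is written as PDB and keeps residue names.
flex_pdb_name() {
    local base=${1%.pdbqt}
    base=${base%.pdb}
    printf '%s.pdb\n' "$base"
}

protein=example   # hypothetical receptor name
ligand=lig1       # hypothetical ligand directory
result=pose1      # hypothetical pose name
out_flex=$(flex_pdb_name "docking/${ligand}/${result}_protmin.pdbqt")

# Dry run: print the smina command instead of executing it.
echo smina --receptor "proteins/${protein}.pdbqt" \
    --ligand "docking/${ligand}/${result}.pdbqt" \
    --flexdist_ligand "docking/${ligand}/${result}.pdbqt" \
    --out "docking/${ligand}/${result}_min.pdbqt" \
    --out_flex "$out_flex" \
    --log "docking/${ligand}/${result}_min.log" --flexdist 6 --minimize
```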
      
      Hope this helps,
      
      David Koes
      Assistant Professor
      Computational and Systems Biology
      University of Pittsburgh
      
      - Miro - 2016-03-29
        
        Thanks a million! I will download the updated code and give it a try ASAP.
        
      - Miro - 2016-03-29
        
        It works! Thank you. The residue name of the ligand is still modified (which could be an issue if it were a peptide, but that is not the case here). However, the PDB output for the flexible residues preserves the original residue names.
        
Miro - 2016-03-31

OK, the problem now is that the PDB output lacks the protein atom types and the residue numbers. As a consequence, most programs do not interpret the protein structure correctly, and it is very difficult to merge the flexible residues back into the original structure.

Miro - 2016-04-02

If at least the residue numbers could be kept, that would already be useful.

Miro - 2016-04-02

OK, I have written a bash script to merge smina's minimised flexible residues back into the original pdbqt file to allow for rescoring.

Miro - 2016-04-04

This is the relevant part of the script (it relies on the fact that the order of the residues, and of the atoms within each residue except for the amide atoms, is the same in the input and output files):

smina --receptor 3HC5_A.pdbqt --ligand docking/${ligand}/${pose}.pdbqt --flexdist_ligand docking/${ligand}/${pose}.pdbqt --out docking/${ligand}/${pose}_min.pdbqt --out_flex docking/${ligand}/${pose}_3HC5_min.pdb --log docking/${ligand}/${pose}_smina.log --flexdist 5 --minimize

# Remerge the smina minimised residues into the original pdbqt file for rescoring

# Collect the original pdbqt records of each flexible residue listed in the log,
# skipping the 4th and 1st lines of each residue and all hydrogens.
# Note: "^.{23}" anchors the match at column 24, so this works as written for 3-digit residue numbers.
grep "^Flexible residues:" docking/${ligand}/${pose}_smina.log | sed "s/Flexible residues: //g" | tr ' ' '\n' | cut -d":" -f2 | while read resnumber; do
grep "^ATOM" 3HC5_A.pdbqt | grep -E "^.{23}${resnumber}" | sed '4d' | sed '1d' | sed -E '/^.{13}H/d' > docking/${ligand}/split_${resnumber}.pdbqt
done
cat docking/${ligand}/split_* > docking/${ligand}/flexible_residues.pdbqt
rm docking/${ligand}/split_*

# Drop the hydrogens from smina's minimised output, then splice its
# coordinates (columns 28-54) into the original records.
grep "^ATOM" docking/${ligand}/${pose}_3HC5_min.pdb | sed -E '/^.{13}H/d' > docking/${ligand}/flexible_residues.pdb
paste <(cut -c 1-27 docking/${ligand}/flexible_residues.pdbqt) <(cut -c 28-54 docking/${ligand}/flexible_residues.pdb) <(cut -c 55-79 docking/${ligand}/flexible_residues.pdbqt) --delimiters '' > docking/${ligand}/flexible_residues_merged.pdbqt
mv docking/${ligand}/flexible_residues_merged.pdbqt docking/${ligand}/flexible_residues.pdbqt
rm docking/${ligand}/flexible_residues.pdb

# Insert ":" after column 26 so that awk can use columns 1-26 as the key,
# then replace the matching records of the original receptor.
sed -i -E 's/^.{26}/&:/' docking/${ligand}/flexible_residues.pdbqt
sed -E 's/^.{26}/&:/' 3HC5_A.pdbqt > docking/${ligand}/3HC5_A.tmp
awk -F":" 'NR==FNR{a[$1]=$0;next;}a[$1]{$0=a[$1]}1' docking/${ligand}/flexible_residues.pdbqt docking/${ligand}/3HC5_A.tmp > docking/${ligand}/3HC5_A_flex.pdbqt
sed -i 's/://g' docking/${ligand}/3HC5_A_flex.pdbqt
rm docking/${ligand}/flexible_residues.pdbqt docking/${ligand}/3HC5_A.tmp

# Convert to pdb and prepare the receptor again for rescoring.
babel docking/${ligand}/3HC5_A_flex.pdbqt docking/${ligand}/3HC5_A_flex.pdb -d -p7.4
rm docking/${ligand}/3HC5_A_flex.pdbqt
~/MGLTools-1.5.6/MGLToolsPckgs/AutoDockTools/Utilities24/prepare_receptor4.py -r docking/${ligand}/3HC5_A_flex.pdb -o docking/${ligand}/${pose}_3HC5_min.pdbqt -U nphs
rm docking/${ligand}/3HC5_A_flex.pdb
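
The keyed replacement in the awk step above can also be written without the temporary ":" marker, by taking the first 26 columns directly as the key. This is only a sketch of that one step under the same fixed-column assumption, not a drop-in replacement for the script; the function names `merge_by_key` and `rec` and the sample records are hypothetical.

```shell
#!/usr/bin/env bash
# merge_by_key UPDATED ORIGINAL: print ORIGINAL, replacing every record whose
# first 26 columns (record type through residue number) also occur in UPDATED.
merge_by_key() {
    awk 'FILENAME == ARGV[1] { new[substr($0, 1, 26)] = $0; next }
         { key = substr($0, 1, 26); print ((key in new) ? new[key] : $0) }' "$1" "$2"
}

# rec SERIAL NAME RESNAME CHAIN RESNUM X Y Z: write one fixed-width ATOM record
# (coordinates land in columns 31-54). Illustrative data only.
rec() {
    printf 'ATOM  %5d %-4s %3s %1s%4d    %8s%8s%8s\n' "$@"
}

orig=$(mktemp)
upd=$(mktemp)
{
    rec 1 ' CA ' LEU A 100 1.000 2.000 3.000
    rec 2 ' CB ' LEU A 100 4.000 5.000 6.000
} > "$orig"
rec 2 ' CB ' LEU A 100 4.500 5.500 6.500 > "$upd"

merged=$(merge_by_key "$upd" "$orig")
echo "$merged"
rm -f "$orig" "$upd"
```

Comparing `FILENAME` with `ARGV[1]` instead of using `NR==FNR` keeps the original file intact even when the file of updated records is empty.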

Last edit: Miro 2016-04-04

Paulette Greenidge - 2019-02-13

Hi,
Has anyone come up with a generic script that merges the flexible residues back into the original PDB? The beauty of smina relative to AutoDock is that one does not have to prepare separate flexible and rigid protein segments. This simplification would seem to be lost if that is exactly what one has to do in order to re-merge the two parts. I cannot figure out a straightforward solution, given that several atoms of the original and the flexible residues will overlap with the same coordinates.

Any smart algorithms out there?

David Koes - 2019-02-13

Hi Paulette,

Apologies for the delayed response - I've been a bit swamped and am slowly making my way through my email queue. I had thought we had implemented this feature at some point, but apparently not. Unfortunately, the task is made a bit difficult by the fact that the atom names get lost during docking, but I put together a script that is hopefully not too fragile (but does require all files to be PDB format):
https://github.com/gnina/gnina/blob/master/scripts/makeflex.py

Paulette Greenidge - 2019-02-13

Dear David,

Thank you very much! I am looking forward to trying the script tomorrow.

Paulette Greenidge - 2019-02-14

Thanks David. I can confirm that your script worked effortlessly and perfectly. For other users, here are the packages I had to install for the script to run; your setup may already have them.

pip install -U ProDy
pip install biopython
python -m pip install --user numpy scipy matplotlib ipython jupyter pandas sympy nose

c4asma - 2019-03-06

Hi David,
I also just tried out the makeflex.py script, but found that in the final protein structure all oxygen and nitrogen atoms, and some carbon atoms, from the flex output are missing, and that I am left with loads of H atoms that are not attached to any other atoms.
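
One way to pin down this kind of loss is to count atoms per element in the flex output and in the merged structure and compare the two. The sketch below assumes PDB records that carry the element symbol in columns 77-78, which not every file does; `element_counts`, `rec` and the sample residue are hypothetical and not taken from the attached example.

```shell
#!/usr/bin/env bash
# element_counts FILE: count ATOM/HETATM records per element symbol
# (PDB columns 77-78) and print one "ELEMENT COUNT" pair per line.
element_counts() {
    grep -E '^(ATOM|HETATM)' "$1" | cut -c77-78 | tr -d ' ' | sort | uniq -c |
        awk '{ print $2, $1 }'
}

# rec SERIAL NAME RESNAME CHAIN RESNUM ELEMENT: write one fixed-width ATOM
# record with dummy coordinates. Illustrative data only.
rec() {
    printf 'ATOM  %5d %-4s %3s %1s%4d    %8s%8s%8s%6s%6s          %2s\n' \
        "$1" "$2" "$3" "$4" "$5" 0.000 0.000 0.000 1.00 0.00 "$6"
}

flex=$(mktemp)
{
    rec 1 ' N  ' SER A 195 N
    rec 2 ' CA ' SER A 195 C
    rec 3 ' CB ' SER A 195 C
    rec 4 ' OG ' SER A 195 O
    rec 5 ' HG ' SER A 195 H
} > "$flex"

counts=$(element_counts "$flex")
echo "$counts"
rm -f "$flex"
```

Running the function on both files and comparing the output shows at a glance which elements lost atoms in the merge.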

- David Koes - 2019-03-06
  
  Can you try using PDBs instead of PDBQTs? Do you have an example you can provide?
  
  - c4asma - 2019-03-06
    
    I am actually using PDBs. I have attached an example of the original
    protein and the flex output.
    
    
    flex_example.tar.gz
    
David Koes - 2019-03-07

I pushed a bug fix. Not sure how I didn't catch this before. Give it another try.
81e5ce27064db63dae8df734d8406d60122ffed5

c4asma - 2019-03-08

Thank you David, that is working. No heavy atoms are missing anymore.
