No residue information on minimized PDBQT

Miro
2016-03-28
2019-03-08
  • Miro

    Miro - 2016-03-28

    Hello,

    I am using smina to minimize ligand-protein complexes, and I have noticed that in the output PDBQT file containing the binding site, the residue names have been replaced with UNL.

    This poses two problems for me:

    1.- On the one hand, it is difficult to merge the minimized residues back into the original PDB file.

    2.- On the other, some scoring functions expect to find residue names.

    I have tried to convert the PDBQT files back to PDB using OpenBabel, but it treats the residues as heteroatoms belonging to a single unknown residue.

    Would it be possible to modify the smina code in order to preserve residue information?

    Best regards,

    Miro

     
    • David Koes

      David Koes - 2016-03-28

      Can you provide a test case?

      David Koes
      Assistant Professor
      Computational and Systems Biology
      University of Pittsburgh

       
      • Miro

        Miro - 2016-03-29

        This would be the command line:

        smina --receptor proteins/${protein}.pdbqt --ligand docking/${ligand}/${result}.pdbqt --flexdist_ligand docking/${ligand}/${result}.pdbqt --out docking/${ligand}/${result}_min.pdbqt --out_flex docking/${ligand}/${result}_protmin.pdbqt --log docking/${ligand}/${result}_min.log --flexdist 6 --minimize

        Attached you will find the input and output files (output files contain the string "min").

        As you can see, in both output files (ligand and protein flexible residues) all residue names have been changed to "UNL".
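A quick way to confirm this on any output file is to tally the residue-name field. The sketch below is only illustrative: it assumes the files follow the fixed-column PDB layout (residue name in columns 18-20), and `residue_name_counts` is a hypothetical helper, not part of smina or OpenBabel. The two sample records are made up for the demonstration.

```python
from collections import Counter


def residue_name_counts(lines):
    """Count residue names (columns 18-20) over ATOM/HETATM records."""
    counts = Counter()
    for line in lines:
        if line.startswith(("ATOM", "HETATM")):
            # Columns 18-20 are indices 17..19 in a 0-based string slice.
            counts[line[17:20].strip()] += 1
    return counts


if __name__ == "__main__":
    # Made-up records, only to show the column layout.
    sample = [
        "ATOM      1  N   UNL     1      11.104   6.134  -6.504",
        "ATOM      2  CA  UNL     1      11.639   6.071  -5.147",
        "REMARK this line is ignored",
    ]
    print(dict(residue_name_counts(sample)))
```

If every record in an output file reports `UNL`, the residue names were lost on writing.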

        I do not know if things would be different if I provided a protein PDBQT file already split into flexible and rigid residues. However, letting smina handle this automatically is a lot more convenient when one is dealing with large numbers of proteins and ligands, as is my case.

        Thank you and kind regards,

        Miro Moman
        RCSI Molecular Medicine
        Dublin, Ireland

         
        • David Koes

          David Koes - 2016-03-29

          The issue here is that OpenBabel's PDBQT writing code eliminates residue
          names (because the flex part only includes side chains, not whole
          residues). We use this code in two places: once at the beginning in
          defining the flexible residues and once at the end to write out a pdbqt
          file, if that is what is requested.

          I've committed a workaround for the first case, so we don't lose residue
          information right off the bat. I've also updated smina.static. If you
          output as a pdbqt you will still get UNL residues, but pdb (no qt)
          output will retain the residue names.
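As a rough illustration of the advice above, the following sketch assembles the same command line used earlier in this thread, but with a `.pdb` file name for `--out_flex`. The function name `smina_minimize_command` and the file names are hypothetical; only the flags already shown in the thread are used, and the sketch assumes smina picks the output format from the file extension, as the reply above suggests.

```python
import shlex


def smina_minimize_command(receptor, ligand, out_ligand, out_flex, log, flexdist=6):
    """Build the argument list for a smina minimization with flexible residues.

    Passing an out_flex name that ends in .pdb (rather than .pdbqt) is the
    point of the example: per the reply above, PDB output keeps residue names.
    """
    return [
        "smina",
        "--receptor", receptor,
        "--ligand", ligand,
        "--flexdist_ligand", ligand,
        "--out", out_ligand,
        "--out_flex", out_flex,
        "--log", log,
        "--flexdist", str(flexdist),
        "--minimize",
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    cmd = smina_minimize_command(
        "protein.pdbqt", "pose.pdbqt", "pose_min.pdbqt",
        "pose_protmin.pdb", "pose_min.log",
    )
    print(shlex.join(cmd))  # pass cmd to subprocess.run(cmd, check=True) to execute
```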

          Hope this helps,

          David Koes
          Assistant Professor
          Computational and Systems Biology
          University of Pittsburgh

           
          • Miro

            Miro - 2016-03-29

            Thanks a million! I will download the updated code and give it a try ASAP.

             
          • Miro

            Miro - 2016-03-29

            It works! Thank you. The residue name of the ligand is still modified (which could be an issue if it were a peptide, but that is not the case here). However, the PDB output for the flexible residues preserves the original residue names.

             
  • Miro

    Miro - 2016-03-31

    OK, the problem now is that it removes the protein atom types and the residue numbers. As a consequence, most programs do not understand the protein structure correctly and it is very difficult to merge the flexible residues back into the original structure.

     
  • Miro

    Miro - 2016-04-02

    If at least the residue number could be kept, that would already be useful.

     
  • Miro

    Miro - 2016-04-02

    OK, I have written a bash script to merge smina's minimised flexible residues back into the original pdbqt file to allow for rescoring.

     
  • Miro

    Miro - 2016-04-04

    This is the relevant part of the script (it relies on the fact that the order of the residues, and of the atoms within each residue except for the amide atoms, is the same in the input and output files):

    # Minimise the pose; the flexible residues are written as PDB
    smina --receptor 3HC5_A.pdbqt --ligand docking/${ligand}/${pose}.pdbqt --flexdist_ligand docking/${ligand}/${pose}.pdbqt --out docking/${ligand}/${pose}_min.pdbqt --out_flex docking/${ligand}/${pose}_3HC5_min.pdb --log docking/${ligand}/${pose}_smina.log --flexdist 5 --minimize

    # Remerge the smina minimised residues into the original pdbqt file for rescoring

    # For each flexible residue listed in the log, extract its ATOM records from the
    # original receptor, dropping the 4th and 1st records and the hydrogens
    grep "^Flexible residues:" docking/${ligand}/${pose}_smina.log | sed "s/Flexible residues: //g" | tr ' ' '\n' | cut -d":" -f2 | while read resnumber; do
        grep "^ATOM" 3HC5_A.pdbqt | grep -E "^.{23}${resnumber}" | sed '4d' | sed '1d' | sed -E "/^.{13}H.*$/d" > docking/${ligand}/split_${resnumber}.pdbqt
    done
    cat docking/${ligand}/split_* > docking/${ligand}/flexible_residues.pdbqt
    rm docking/${ligand}/split_*

    # Same hydrogen filter on the minimised flexible residues
    grep "^ATOM" docking/${ligand}/${pose}_3HC5_min.pdb | sed -E "/^.{13}H.*$/d" > docking/${ligand}/flexible_residues.pdb

    # Keep columns 1-27 and 55-79 of the original records; take the coordinates (columns 28-54) from the minimised ones
    paste <(cut -c 1-27 docking/${ligand}/flexible_residues.pdbqt) <(cut -c 28-54 docking/${ligand}/flexible_residues.pdb) <(cut -c 55-79 docking/${ligand}/flexible_residues.pdbqt) --delimiters '' > docking/${ligand}/flexible_residues_merged.pdbqt
    mv docking/${ligand}/flexible_residues_merged.pdbqt docking/${ligand}/flexible_residues.pdbqt
    rm docking/${ligand}/flexible_residues.pdb

    # Insert a ':' after column 26 so that the first 26 columns act as a lookup key for awk
    sed -E -i 's/^.{26}/&:/' docking/${ligand}/flexible_residues.pdbqt
    sed -E -e 's/^.{26}/&:/' 3HC5_A.pdbqt > docking/${ligand}/3HC5_A.tmp

    # Replace every receptor record whose key matches a minimised record, then strip the ':' markers
    awk -F":" 'NR==FNR{a[$1]=$0;next;}a[$1]{$0=a[$1]}1' docking/${ligand}/flexible_residues.pdbqt docking/${ligand}/3HC5_A.tmp > docking/${ligand}/3HC5_A_flex.pdbqt
    sed -i 's/://g' docking/${ligand}/3HC5_A_flex.pdbqt
    rm docking/${ligand}/flexible_residues.pdbqt docking/${ligand}/3HC5_A.tmp

    # Convert to PDB with babel, then prepare the receptor pdbqt again for rescoring
    babel docking/${ligand}/3HC5_A_flex.pdbqt docking/${ligand}/3HC5_A_flex.pdb -d -p7.4
    rm docking/${ligand}/3HC5_A_flex.pdbqt
    ~/MGLTools-1.5.6/MGLToolsPckgs/AutoDockTools/Utilities24/prepare_receptor4.py -r docking/${ligand}/3HC5_A_flex.pdb -o docking/${ligand}/${pose}_3HC5_min.pdbqt -U nphs
    rm docking/${ligand}/3HC5_A_flex.pdb
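The central step of the script, the `paste`/`cut` column splice, can be sketched in Python as follows. This is not the script above rewritten, only a minimal illustration of the same idea under the same assumption (both files list the same heavy atoms in the same order). The function names are hypothetical, and unlike `cut -c 55-79` the sketch keeps everything after column 54 of the original record.

```python
def heavy_atom_records(lines):
    """Keep ATOM records whose 14th character is not 'H' (same test as the sed filter)."""
    return [
        line.rstrip("\n")
        for line in lines
        if line.startswith("ATOM") and line[13:14] != "H"
    ]


def splice_coordinates(original_line, minimized_line):
    """Return the original record with columns 28-54 taken from the minimized one.

    Columns 1-27 (names, residue number) and everything after column 54
    (charges, atom types) come from the original record.
    """
    return original_line[:27] + minimized_line[27:54] + original_line[54:]


def splice_all(original_lines, minimized_lines):
    """Pair records by position and splice the minimized coordinates in."""
    original = heavy_atom_records(original_lines)
    minimized = heavy_atom_records(minimized_lines)
    if len(original) != len(minimized):
        # Positional pairing is only meaningful if both sides have the same atoms.
        raise ValueError(
            f"record count mismatch: {len(original)} original vs {len(minimized)} minimized"
        )
    return [splice_coordinates(o, m) for o, m in zip(original, minimized)]
```

Raising on a count mismatch is a deliberate choice here: pairing by position silently produces wrong coordinates if the two files disagree on which atoms are present.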

     

    Last edit: Miro 2016-04-04
  • Paulette Greenidge

    Hi,
    Has anyone come up with a generic script that merges the flexible residues back into the original PDB? The beauty of Smina relative to AutoDock is that one does not have to prepare separate flexible and rigid protein segments. This simplification would seem to be lost if that is exactly what one has to do in order to re-merge the two parts. I cannot figure out a straightforward solution, given that several atoms of the original and the flexible residues will overlap with the same coordinates.

    Any smart algorithms out there?

     
  • David Koes

    David Koes - 2019-02-13

    Hi Paulette,

    Apologies for the delayed response - I've been a bit swamped and am slowly making my way through my email queue. I had thought we had implemented this feature at some point, but apparently not. Unfortunately, the task is made a bit difficult by the fact that the atom names get lost during docking, but I put together a script that is hopefully not too fragile (but does require all files to be in PDB format):
    https://github.com/gnina/gnina/blob/master/scripts/makeflex.py

     
  • Paulette Greenidge

    Dear David,

    Thank you very much! I am looking forward to trying the script tomorrow.

     
  • Paulette Greenidge

    Thanks David. I can confirm that your script worked effortlessly and perfectly. For other users, here are the packages I had to install for the script to run; your setup may already have them.

    pip install -U ProDy
    pip install biopython
    python -m pip install --user numpy scipy matplotlib ipython jupyter pandas sympy nose
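Before installing anything, it may be worth checking which packages are already importable. A minimal sketch, assuming only the standard library: `missing_modules` is a hypothetical helper, and `prody` and `Bio` are the import names of ProDy and Biopython. Which of the packages listed above the script itself needs is not stated in this thread.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Top-level import names only; dotted names are not handled by this sketch.
    print(missing_modules(["prody", "Bio", "numpy"]))
```

An empty list means all of the named modules can be imported by the interpreter that ran the check.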

     
  • c4asma

    c4asma - 2019-03-06

    Hi David,
    I also just tried out the makeflex.py script, but found that in the final protein structure all oxygens and nitrogens, and some carbon atoms, from the flex output are missing, and that I am left with loads of H atoms that are not attached to any other atoms.
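A symptom like this can be checked by tallying elements before and after merging. The sketch below is a rough diagnostic, not part of makeflex.py: the function names are hypothetical, it reads the element from columns 77-78 of each PDB record, and, as a crude assumption, falls back to the first letter of the atom name when that field is empty.

```python
from collections import Counter


def element_of(line):
    """Element symbol from columns 77-78, else the first letter of the atom name."""
    element = line[76:78].strip()
    if element:
        return element.upper()
    for ch in line[12:16]:  # atom name field, columns 13-16
        if ch.isalpha():
            return ch.upper()
    return "?"


def element_counts(lines):
    """Count atoms per element over ATOM/HETATM records."""
    return Counter(
        element_of(line) for line in lines if line.startswith(("ATOM", "HETATM"))
    )


def lost_heavy_atoms(before_lines, after_lines):
    """Per-element count of non-hydrogen atoms present before but absent after."""
    lost = element_counts(before_lines) - element_counts(after_lines)
    lost.pop("H", None)
    return dict(lost)
```

Calling `lost_heavy_atoms` with the lines of the input structure and of the merged structure gives an empty dictionary when no heavy atoms were dropped.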

     
    • David Koes

      David Koes - 2019-03-06

      Can you try using PDBs instead of PDBQTs? Do you have an example you can provide?

       
  • David Koes

    David Koes - 2019-03-07

    I pushed a bug fix. Not sure how I didn't catch this before. Give it another try.
    81e5ce27064db63dae8df734d8406d60122ffed5

     
  • c4asma

    c4asma - 2019-03-08

    Thank you David, that is working. No heavy atoms are missing anymore.

     
