Merry Christmas!
I've noticed a possible flaw in the flexible sidechain code (probably originating in the original Vina code, which I admittedly haven't tested for this). If I do flexible docking using Smina, then rebuild the receptor with the modified flexed sidechains and do a simple rescoring (I used vinardo), the rescored scores deviate quite significantly and seemingly randomly (by up to ±2 kcal/mol) from the original flexible docking scores. I wonder if this could be one of the reasons for the noisiness of flexible docking scoring that you've mentioned in your Gnina presentation and paper. For now, it seems I have to generate lots of minima when doing flexible docking, and rescore them.
Best regards,
Vis
In order for the rescoring to have the same scores you need to continue to specify the same residues as flexible. This is because the intramolecular interactions of the flexible sidechains are included in the original score. If you have an example where this is not the case, please provide it.
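A minimal sketch of what "specify the same residues as flexible" could look like when rescoring. It only builds the smina command line and does not run it. The file names and the residue A:31 are hypothetical placeholders; the flags used (-r, -l, --scoring, --score_only, --flexres) are standard smina options. Whether this reproduces the docking score for a given system has not been checked here.

```python
def smina_rescore_cmd(receptor, ligand, flexres, scoring="vinardo"):
    """Build a smina command line that scores a pose without re-docking."""
    cmd = ["smina", "-r", receptor, "-l", ligand,
           "--scoring", scoring, "--score_only"]
    if flexres:
        # Same chain:resid list that was used in the flexible docking run.
        cmd += ["--flexres", ",".join(flexres)]
    return cmd

# Hypothetical inputs: a receptor rebuilt with the docked sidechain and one pose.
flex_cmd = smina_rescore_cmd("rebuilt_receptor.pdbqt", "docked_pose.pdbqt", ["A:31"])
rigid_cmd = smina_rescore_cmd("rebuilt_receptor.pdbqt", "docked_pose.pdbqt", [])

print(" ".join(flex_cmd))   # includes the flexible sidechain's intramolecular terms
print(" ".join(rigid_cmd))  # treats the same coordinates as a rigid receptor
```

The two commands score identical coordinates; only the first keeps residue A:31 declared as flexible, which is the condition described above for matching the original flexible docking score.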
I understand and agree with your evaluation of what's happening. However, I find the mismatch between the flexible and rigid receptor scores (and therefore pose rankings) troubling. I guess from now on I will be freezing the flex sidechains from the minima and re-running the "reconstituted" rigid dockings...
But there isn't a mismatch if you set the same residues to be flexible. It would be troubling if a system with more interacting atoms in it didn't have a different score.
Well, in the rigid docking the flex residues are obviously different from flexible docking (zero flex residues vs one), so the mismatch between scores is understandable. I guess the whole thread could be deleted :) But I will do the "refrozen" sidechain docking anyway, because the particular sidechain rotamer I'm interested in for some ligands in the series is somehow not accessible: it (or rather the ligand poses that are modulated by that sidechain rotamer) is higher in energy. That's why the mismatch between the "flex" and "reconstituted rigid" sidechain scores came up for me. When I do "reconstituted rigid sidechain" docking, those lost ligand poses of course show up easily. Thank you for your time and valuable input!
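A rough sketch of the "refreezing" step, assuming the flex output file keeps chain, residue number and atom name in the usual fixed PDB columns and that atom names are unique within a residue (hydrogens in some PDBQT files may not be). The function names and the demo coordinates are made up for illustration; this is not how smina itself rebuilds a receptor.

```python
def atom_key(line):
    """(chain, residue number, atom name) from fixed PDB/PDBQT columns."""
    return (line[21], line[22:26].strip(), line[12:16].strip())

def first_model_atoms(flex_text):
    """Map atom key -> ATOM line for the first MODEL of a flex output file."""
    atoms = {}
    for line in flex_text.splitlines():
        if line.startswith("ENDMDL"):
            break
        if line.startswith(("ATOM", "HETATM")):
            atoms[atom_key(line)] = line
    return atoms

def freeze_sidechains(receptor_text, flex_text):
    """Copy x/y/z (columns 31-54) of matching atoms into the receptor."""
    moved = first_model_atoms(flex_text)
    out = []
    for line in receptor_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            src = moved.get(atom_key(line))
            if src is not None:
                line = line[:30] + src[30:54] + line[54:]
        out.append(line)
    return "\n".join(out) + "\n"

def demo_line(name, chain, resnum, x, y, z):
    """Hypothetical helper: one ATOM record with correct column positions."""
    return "ATOM  %5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f" % (
        1, name, "LEU", chain, resnum, x, y, z)

receptor = "\n".join([demo_line(" CA ", "A", 31, 0, 0, 0),
                      demo_line(" CB ", "A", 31, 1, 2, 3)])
flex = ("MODEL 1\n" + demo_line(" CB ", "A", 31, 4, 5, 6) + "\nENDMDL\n"
        "MODEL 2\n" + demo_line(" CB ", "A", 31, 7, 8, 9) + "\nENDMDL\n")
frozen = freeze_sidechains(receptor, flex)
print(frozen)
```

Only the first model is used here; picking a different minimum would mean selecting a different MODEL block before copying coordinates.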
Something's still bothering me. Intra-ligand energies are calculated using different rules than P-L (protein-ligand) interactions (e.g. the term weights seem to be gone, etc.), and I am afraid intra-L energy quality is worse than P-L just because P-L terms are so carefully weighted and so much research has gone into them. I am not sure if I am making myself clear or if I am correct. (And the flex sidechain interaction with the small molecule is intra-ligand from the viewpoint of the program, if I understand it correctly.)
I've tried flexible docking with vina and smina, and it seems that smina does not take flexible residues into account when applying the scoring function to the system, as the computed affinity tends to zero as more and more amino acid residues are treated as flexible. Therefore, in order to use flexible docking you should either use Vina, or debug the source of smina to compile a working version.
Can you provide examples with data files? You should expect the predicted energy to go down (more negative) with increased flexibility, since it should be possible to sample lower energy conformations.
The problem is that the energy goes up and becomes more positive. I've tried gnina in the same conditions and the energy became more negative as it should.
Rigid docking
smina is based off AutoDock Vina. Please cite appropriately.
Weights Terms
-0.035579 gauss(o=0,_w=0.5,_c=8)
-0.005156 gauss(o=3,_w=2,_c=8)
0.840245 repulsion(o=0,_c=8)
-0.035069 hydrophobic(g=0.5,_b=1.5,_c=8)
-0.587439 non_dir_h_bond(g=-0.7,_b=0,_c=8)
1.923 num_tors_div
Using random seed: -305716360
mode |   affinity | dist from best mode
     | (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
   1         -8.4      0.000      0.000
   2         -8.3      2.831      4.121
   3         -8.3      2.391      4.321
   4         -8.3      1.257      2.837
   5         -8.3      1.312      2.017
   6         -8.2      2.948      5.756
   7         -8.2      2.376      5.480
   8         -8.2      2.389      5.434
   9         -8.2      2.166      3.611
  10         -8.2      2.351      5.309
Flexible docking with smina
smina is based off AutoDock Vina. Please cite appropriately.
Weights Terms
-0.035579 gauss(o=0,_w=0.5,_c=8)
-0.005156 gauss(o=3,_w=2,_c=8)
0.840245 repulsion(o=0,_c=8)
-0.035069 hydrophobic(g=0.5,_b=1.5,_c=8)
-0.587439 non_dir_h_bond(g=-0.7,_b=0,_c=8)
1.923 num_tors_div
Flexible residues: A:31 A:37 B:31 B:37 C:31 C:37 D:31 D:37
Using random seed: 2139714014
mode |   affinity | dist from best mode
     | (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
   1         -5.0      0.000      0.000
   2         -4.9      1.631      3.598
   3         -4.8      1.641      3.619
   4         -4.8      1.963      3.232
   5         -4.8      1.599      2.784
   6         -4.7      1.607      2.921
   7         -4.7      2.752      5.006
   8         -3.8      2.611      4.072
   9         -3.3      1.956      2.916
Flexible docking with gnina
gnina v1.0 HEAD:6381355 Built Mar 6 2021.
gnina is based on smina and AutoDock Vina.
Please cite appropriately.
Commandline: ./gnina -r /mnt/d/smina/protein.pdbqt --config /mnt/d/smina/conf.txt -l /mnt/d/smina/ligand.pdbqt -o /mnt/d/smina/lig_out_gnina.pdbqt --log /mnt/d/smina/log_gnina.txt --out_flex /mnt/d/smina/flex_out_gnina.pdbqt --cnn_scoring none
Flexible residues: A:31 A:37 B:31 B:37 C:31 C:37 D:31 D:37
Using random seed: 575989108
mode |  affinity  |    CNN     |   CNN
     | (kcal/mol) | pose score | affinity
-----+------------+------------+----------
   1        -8.98      -1.0000      0.000
   2        -8.95      -1.0000      0.000
   3        -8.91      -1.0000      0.000
   4        -8.86      -1.0000      0.000
   5        -8.81      -1.0000      0.000
   6        -8.76      -1.0000      0.000
   7        -8.73      -1.0000      0.000
   8        -8.72      -1.0000      0.000
   9        -8.67      -1.0000      0.000
  10        -8.66      -1.0000      0.000
The protein was 6NV1
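For comparing logs like the ones above without reading them by eye, here is a small sketch that pulls the affinity column out of a smina or gnina log and reports the best mode. The regular expression and function name are my own; the sample text is just the first two rows of the rigid and smina flexible tables posted above.

```python
import re

# A mode row is: integer mode number, affinity, then two more numeric columns.
MODE_ROW = re.compile(r"^\s*(\d+)\s+(-?\d+(?:\.\d+)?)\s+\S+\s+\S+\s*$")

def affinities(log_text):
    """Return the affinity column (kcal/mol) of every mode row in a log."""
    return [float(m.group(2)) for line in log_text.splitlines()
            if (m := MODE_ROW.match(line))]

rigid_log = """
0.840245 repulsion(o=0,_c=8)
1.923 num_tors_div
mode | affinity | dist from best mode
-----+------------+----------+----------
1 -8.4 0.000 0.000
2 -8.3 2.831 4.121
"""

flex_log = """
Flexible residues: A:31 A:37 B:31 B:37 C:31 C:37 D:31 D:37
mode | affinity | dist from best mode
-----+------------+----------+----------
1 -5.0 0.000 0.000
2 -4.9 1.631 3.598
"""

best_rigid = min(affinities(rigid_log))
best_flex = min(affinities(flex_log))
# Positive difference means the flexible run scored worse than the rigid one.
print("flex - rigid (kcal/mol):", round(best_flex - best_rigid, 1))
```

The weight lines (for example "1.923 num_tors_div") are skipped because they do not start with an integer followed by whitespace and three more columns.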
You're going to have to provide an actual reproducible test case for me to look into this further, as I get reasonable results with 6nv1. Have you looked at the output files to see what is going on?
I have redocked the starting models from the pdb files and received good results, consistent with the theory: flexible docking was better than rigid. I don't know where to look; probably the use of pdbqt files made by AutoDockTools introduces some bugs into the computational scheme implemented in smina.
Last edit: Vadim Shiryaev 2022-11-01
I think this issue is original to Vina. I noticed that Vina scores differed when identical side chain positions were either flexible or fixed. I had an email exchange with Oleg Trott about this, but I was never satisfied by the explanation. Whether or not a chain is defined as flexible should not logically affect the binding energy; it's artificial that any side chain is fixed in any case. I think it's an artifact of dividing by the system's degrees of freedom, encoded in the original software (I don't recall if it's also the case in AutoDock). I've avoided comparing flexres scores with fixed protein scores for that reason.
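To make the "dividing by degrees of freedom" idea concrete: the published Vina scoring function divides the intermolecular energy by 1 + w * N_rot, with w of about 0.0585 (this is the num_tors_div term in the logs above). The sketch below only illustrates how such a divisor changes a score as the torsion count grows. Whether flexible side-chain torsions are actually counted in N_rot is not established in this thread, so the second call is a "what if", and the energy and torsion counts are made-up numbers.

```python
def rot_normalized(e_inter, n_rot, w=0.0585):
    """Vina-style normalization: energy divided by 1 + w * (rotatable bonds)."""
    return e_inter / (1.0 + w * n_rot)

e_inter = -10.0          # hypothetical raw intermolecular energy
ligand_only = rot_normalized(e_inter, n_rot=5)
# Hypothetical: the same energy if 4 side-chain torsions were also counted.
with_sidechain = rot_normalized(e_inter, n_rot=5 + 4)

print(round(ligand_only, 2), round(with_sidechain, 2))
```

Under that assumption, a larger torsion count pulls an otherwise identical energy toward zero, which is the kind of shift described above; checking the source for what is counted in N_rot would settle it.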