Re: [Rdkit-discuss] a 2D to 3D (smi to sdf) conformer generator python script using rdkit
From: Francois B. <ber...@bi...> - 2017-06-15 07:47:23
On 06/15/2017 03:50 PM, Greg Landrum wrote:
> Thanks for letting people know about this. If we can get a consensus
> form that people agree makes sense, this might be a nice addition to
> either the RDKit/Scripts directory or the cookbook.
>
> A couple of smallish comments after a quick skim:
> - I would really strongly encourage you to use the ETKDG parameters
>   (http://pubs.acs.org/doi/abs/10.1021/acs.jcim.5b00654) when doing the
>   embedding. This really helps a lot with the quality of the conformations
>   and lets you skip the UFF step.
> - The built-in RMSD pruning has improved since JP's article, it may be
>   worth looking at that.

It would be nice to have a much faster protocol than the one I implemented.
This protocol (the one from the paper) is very slow because of the RMSD
pruning step (not because of UFF). The more conformers per molecule you
need, the slower it gets.
But it works, at least.

The problem with changing the protocol to something more modern is that
you have to redo all the statistical validation they did to confirm that
it works well, which requires quite some time and motivation.

> - If you want to make the embedding step itself robust, it wouldn't be a
>   bad idea to try switching to random coordinate generation if the initial
>   embedding fails.

Thanks for the comment.
I might update this part if I see it fail.

Regards,
F.

> Best,
> -greg
>
> On Wed, Jun 14, 2017 at 9:27 AM, Francois BERENGER <ber...@bi...> wrote:
>
> > Hello,
> >
> > I gave a try at reproducing the protocol described in:
> >
> > @article{DBLP:journals/jcisd/EbejerMD12,
> >   author  = {Jean{-}Paul Ebejer and Garrett M. Morris and
> >              Charlotte M. Deane},
> >   title   = {Freely Available Conformer Generation Methods:
> >              How Good Are They?},
> >   journal = {Journal of Chemical Information and Modeling},
> >   volume  = {52},
> >   number  = {5},
> >   pages   = {1146--1158},
> >   year    = {2012},
> >   url     = {https://doi.org/10.1021/ci2004658},
> >   doi     = {10.1021/ci2004658},
> > }
> >
> > The resulting script is there:
> >
> > https://github.com/UnixJunkie/smi2sdf3d
> >
> > I hope I could reproduce their protocol exactly.
> > Sorry, my python is so rusty these days.
> >
> > Comments and contributions are welcome.
> >
> > Even auditing the code for correctness is welcome since it is
> > doing some scientific computation.
> >
> > It is a little bit too slow for my taste.
> >
> > You can use it like this to get a max of 10 conformers
> > per molecule in your input.smi file:
> >
> > ./smi2sdf.py 10 input.smi output.sdf
> >
> > Best regards,
> > Francois.
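
For readers who want to try the three suggestions quoted above (ETKDG parameters, built-in RMSD pruning, and a fallback to random starting coordinates), here is a minimal sketch. It is not the code from smi2sdf3d and it is not the protocol validated in the 2012 paper: it skips the UFF step and relies on RDKit's own pruning, so none of the paper's statistics carry over. The function names `embed_conformers` and `write_conformers`, the 0.5 Å pruning threshold, and the fixed seed are illustrative assumptions, not values taken from this thread.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def embed_conformers(smiles, max_confs=10, rms_thresh=0.5, seed=42):
    """Embed up to max_confs conformers with ETKDG.

    Returns (mol_with_hydrogens, conformer_ids), or (None, []) if the
    SMILES cannot be parsed. rms_thresh and seed are arbitrary defaults
    chosen for this sketch.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None, []
    mol = Chem.AddHs(mol)  # explicit hydrogens are needed for sensible 3D

    params = AllChem.ETKDG()
    params.randomSeed = seed            # reproducible embedding
    params.pruneRmsThresh = rms_thresh  # built-in RMSD pruning, in Angstrom

    cids = list(AllChem.EmbedMultipleConfs(mol, max_confs, params))
    if not cids:
        # Initial embedding failed: retry from random coordinates.
        params.useRandomCoords = True
        cids = list(AllChem.EmbedMultipleConfs(mol, max_confs, params))
    return mol, cids


def write_conformers(mol, cids, path):
    """Write each conformer of mol as a separate record in an SD file."""
    writer = Chem.SDWriter(path)
    for cid in cids:
        writer.write(mol, confId=cid)
    writer.close()
```

Usage would look like `mol, cids = embed_conformers("CCO", max_confs=10)` followed by `write_conformers(mol, cids, "output.sdf")`. Because pruning happens inside the embedding call, fewer than `max_confs` conformers may come back for small or rigid molecules.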