Re: [Rdkit-discuss] Random structure generator based on chemical formula?
From: Jean-Marc N. <jm....@un...> - 2020-06-16 15:40:41
Dear all,

I used the pyLSD software (http://eos.univ-reims.fr/LSD/JmnSoft/PyLSD/)
to solve the isomer generation problem for the formulas C11H24 and C12H9N.

The similarity of the two problems is only apparent.
The 159 isomers of formula C11H24 were generated within a few seconds.
I stopped the resolution process for C12H9N after a few million solutions
because it could have lasted for days, if not for weeks.

Best,

Jean-Marc Nuzillard

On 16/06/2020 at 11:44, Joshua Meyers wrote:
> Hey Theo,
>
> As others have mentioned, this is indeed a non-trivial problem.
> One method would be to use a de novo molecular generator with the aim
> of recovering these isomers.
>
> The ability of a generator to generate isomers is actually one of the
> benchmarks of a de novo method in GuacaMol,
> i.e. how many of the 159 isomers of C11H24 can be recovered by a
> method (that target excludes stereochemistry).
> https://github.com/BenevolentAI/guacamol/blob/da0917a679f27abdf1d526ebbf84ee6792bac2a4/guacamol/standard_benchmarks.py#L15-L28
>
> You may be able to adapt this for your use case?
>
> Cheers,
> Josh
>
> ...Incidentally, Jan's method is also implemented there :D
>
> On Mon, 15 Jun 2020 at 12:32, theozh <th...@gm...> wrote:
>
> > Hello Jan,
> >
> > thank you very much for your effort.
> > It might take a while until I have digested what you have
> > implemented.
> > So far, I don't have Jupyter Notebook installed and I'm still
> > running RDKit 2019.03 or older.
> > I'm running RDKit on Windows, but in general, it might be a good
> > opportunity to start with Linux.
> >
> > best,
> > Theo.
> >
> > On 14.06.2020 at 12:48, Jan Halborg Jensen wrote:
> > > I whipped up something quick and dirty:
> > > https://colab.research.google.com/drive/18esebASwEfPviu-zn9xIs1fwmED-7Yi3?usp=sharing
> > >
> > > > On 13 Jun 2020, at 10.54, theozh <th...@gm...> wrote:
> > > >
> > > > Dear RDKit-Community,
> > > >
> > > > is there maybe a way with RDKit to generate random (but valid)
> > > > molecules with a given chemical sum formula?
> > > > For example:
> > > > C12H9N could generate Carbazole as a valid compound.
> > > > The output would be mol or SMILES.
> > > >
> > > > I haven't found (yet) anything in this direction in the RDKit
> > > > documentation or on the web.
> > > > But maybe I overlooked some modules, functions or examples
> > > > which could be the base for realizing such a random generator?
> > > >
> > > > Thank you for any hints,
> > > > Theo.
>
> _______________________________________________
> Rdkit-discuss mailing list
> Rdk...@li...
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

-- 
Jean-Marc Nuzillard
Directeur de Recherches au CNRS

Institut de Chimie Moléculaire de Reims
CNRS UMR 7312
Moulin de la Housse
CPCBAI, Bâtiment 18
BP 1039
51687 REIMS Cedex 2
France

Tel : 03 26 91 82 10
Fax : 03 26 91 31 66

http://www.univ-reims.fr/icmr
http://eos.univ-reims.fr/LSD/CSNteam.html

http://www.univ-reims.fr/LSD/
http://www.univ-reims.fr/LSD/JmnSoft/
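
Editorial sketch (not part of the original thread): one plausible reason
the two formulas behave so differently is their degree of unsaturation,
i.e. the number of rings plus pi bonds each isomer must contain. The
small pure-Python example below computes it from a formula string. The
function names and the valence table are assumptions made for this
illustration; they do not come from pyLSD, RDKit or GuacaMol, and only
standard valences are handled.

```python
import re

# Standard valences (assumption: no charged or hypervalent atoms).
VALENCE = {"C": 4, "N": 3, "O": 2, "S": 2,
           "H": 1, "F": 1, "Cl": 1, "Br": 1, "I": 1}


def parse_formula(formula):
    """Turn a simple formula such as 'C12H9N' into {'C': 12, 'H': 9, 'N': 1}."""
    if not re.fullmatch(r"(?:[A-Z][a-z]?\d*)+", formula):
        raise ValueError(f"cannot parse formula: {formula!r}")
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        if element not in VALENCE:
            raise ValueError(f"unsupported element: {element}")
        counts[element] = counts.get(element, 0) + int(number or 1)
    return counts


def degrees_of_unsaturation(formula):
    """Rings plus pi bonds implied by the formula: 1 + sum(n_i * (v_i - 2)) / 2."""
    counts = parse_formula(formula)
    twice = 2 + sum(n * (VALENCE[el] - 2) for el, n in counts.items())
    if twice < 0 or twice % 2:
        raise ValueError(f"no valid neutral closed-shell structure for {formula!r}")
    return twice // 2


print(degrees_of_unsaturation("C11H24"))  # 0
print(degrees_of_unsaturation("C12H9N"))  # 9
```

With zero degrees of unsaturation, every C11H24 isomer is an acyclic
alkane, so only the carbon skeleton varies. C12H9N has to place nine
rings and/or multiple bonds, which likely accounts for the far larger
solution space reported above.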