Re: [Rdkit-discuss] Pg cartridge - mol_to_ctab() and trouble with conformers.
Open-Source Cheminformatics and Machine Learning
Brought to you by:
glandrum
From: Jan H. J. <ja...@bi...> - 2014-03-05 18:35:14
|
Hi Greg, Thanks for the explanation. I added this to rdkit_io.c in mol_from_ctab(): + bool keepConformer = PG_GETARG_BOOL(1); - mol = parseMolCTAB(data,false,true); + mol = parseMolCTAB(data,keepConformer,true); and then I can get the expected behavior and have my tests complete successfully. Yes :-). I will go ahead and create a pull request for mol_to_ctab(). The tests for mol_to_ctab() will assume that mol_from_ctab() uses the optional parameter to keep the conformer. Cheers -- Jan On 2014-03-05 13:25, Greg Landrum wrote: > Hi Jan, > > The below behavior is the result of a bug > (https://github.com/rdkit/rdkit/issues/229). > mol_from_ctab() takes an (undocumented) optional argument that is > supposed to determine whether or not the molecule's conformation is > stored in the database. The default is to not store the conformation; > this reduces the size of the database and the speed at which molecules > are depickled. The bug is that even if you try to keep the > conformation the argument is ignored and the conformation is discarded. > > I'll get this fixed tomorrow morning. Alternatively, if you want to > fix it now, the change just needs to be made in the definition of > mol_from_ctab() in rdkit_io.c > > -greg > > > > > On Wed, Mar 5, 2014 at 10:27 AM, Jan Holst Jensen > <ja...@bi... <mailto:ja...@bi...>> wrote: > > Hi, > > About ready to push a changeset for implementing mol_to_ctab(), > but I would like it to play nice and preserve input depictions. > > Ideally I would like the following > > select mol_to_ctab(mol_from_ctab(<input-molfile>)); > > to output a molfile where the coordinates of "input-molfile" are > preserved. > > If I do that in Python it works: > > >>> from rdkit import Chem > >>> m = Chem.MolFromMolBlock("""chiral1.mol > ... ChemDraw04200416412D > ... > ... 5 4 0 0 0 0 0 0 0 0999 V2000 > ... -0.0141 0.0553 0.0000 C 0 0 0 0 0 0 0 0 > 0 0 0 0 > ... 0.8109 0.0553 0.0000 F 0 0 0 0 0 0 0 0 > 0 0 0 0 > ... -0.4266 0.7697 0.0000 Br 0 0 0 0 0 0 0 0 > 0 0 0 0 > ... -0.0141 -0.7697 0.0000 Cl 0 0 0 0 0 0 0 0 > 0 0 0 0 > ... -0.8109 -0.1583 0.0000 C 0 0 0 0 0 0 0 0 > 0 0 0 0 > ... 1 2 1 0 > ... 1 3 1 0 > ... 1 4 1 1 > ... 1 5 1 0 > ... M END""") > >>> m > <rdkit.Chem.rdchem.Mol object at 0x1240980> > *>>> m.GetNumConformers()** > **1* > >>> Chem.MolToMolBlock(m) > 'chiral1.mol\n RDKit 2D\n\n 5 4 0 0 0 0 0 > 0 0 0999 V2000\n -0.0141 0.0553 0.0000 C 0 0 0 0 > 0 0 0 0 0 0 0 0\n 0.8109 0.0553 0.0000 F 0 0 > 0 0 0 0 0 0 0 0 0 0\n -0.4266 0.7697 0.0000 > Br 0 0 0 0 0 0 0 0 0 0 0 0\n -0.0141 -0.7697 > 0.0000 Cl 0 0 0 0 0 0 0 0 0 0 0 0\n -0.8109 > -0.1583 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0\n 1 > 2 1 6\n 1 3 1 0\n 1 4 1 0\n 1 5 1 0\nM END\n' > >>> quit() > > > In the PG cartridge I lose the conformer of the input. My > implementation looks like this: > > rdkit_io.c: > > PG_FUNCTION_INFO_V1(mol_to_ctab); > Datum mol_to_ctab(PG_FUNCTION_ARGS); > Datum > mol_to_ctab(PG_FUNCTION_ARGS) { > CROMol mol; > char *str; > int len; > > fcinfo->flinfo->fn_extra = SearchMolCache( > fcinfo->flinfo->fn_extra, > fcinfo->flinfo->fn_mcxt, > PG_GETARG_DATUM(0), > NULL, &mol, NULL); > > bool createDepictionIfMissing = PG_GETARG_BOOL(1); > str = makeCtabText(mol, &len, createDepictionIfMissing); > > PG_RETURN_CSTRING( pnstrdup(str, len) ); > } > > > adapter.cpp: > > extern "C" char * > makeCtabText(CROMol data, int *len, bool > createDepictionIfMissing) { > ROMol *mol = (ROMol*)data; > > try { > ereport(NOTICE, > (errcode(ERRCODE_SUCCESSFUL_COMPLETION), > errmsg("mol conformer count = %d", > mol->getNumConformers()))); > > if (createDepictionIfMissing && mol->getNumConformers() == > 0) { > RDDepict::compute2DCoords(*mol); > } > StringData = MolToMolBlock(*mol); > } catch (...) { > ereport(WARNING, > (errcode(ERRCODE_WARNING), > errmsg("makeCtabText: problems converting > molecule to CTAB"))); > StringData=""; > } > > *len = StringData.size(); > return (char*)StringData.c_str(); > } > > > If I run the Python example equivalent from psql: > > postgres=# select mol_to_ctab(mol_from_ctab('chiral1.mol > ChemDraw04200416412D > > 5 4 0 0 0 0 0 0 0 0999 V2000 > -0.0141 0.0553 0.0000 C 0 0 0 0 0 0 0 0 0 > 0 0 0 > 0.8109 0.0553 0.0000 F 0 0 0 0 0 0 0 0 0 > 0 0 0 > -0.4266 0.7697 0.0000 Br 0 0 0 0 0 0 0 0 0 > 0 0 0 > -0.0141 -0.7697 0.0000 Cl 0 0 0 0 0 0 0 0 0 > 0 0 0 > -0.8109 -0.1583 0.0000 C 0 0 0 0 0 0 0 0 0 > 0 0 0 > 1 2 1 0 > 1 3 1 0 > 1 4 1 1 > 1 5 1 0 > M END', false)); > *NOTICE: mol conformer count = 0* > mol_to_ctab > ----------------------------------------------------------------------- > + > RDKit 2D + > + > 5 4 0 0 0 0 0 0 0 0999 > V2000 + > 0.0000 0.0000 0.0000 C 0 0 0 0 0 0 0 0 0 > 0 0 0+ > -1.5000 0.0000 0.0000 F 0 0 0 0 0 0 0 0 0 > 0 0 0+ > -0.0000 -1.5000 0.0000 Br 0 0 0 0 0 0 0 0 0 > 0 0 0+ > 0.0000 1.5000 0.0000 Cl 0 0 0 0 0 0 0 0 0 > 0 0 0+ > 1.5000 0.0000 0.0000 C 0 0 0 0 0 0 0 0 0 > 0 0 0+ > 1 2 1 6 + > 1 3 1 0 + > 1 4 1 0 + > 1 5 1 0 + > M END + > > (1 row) > > postgres=# > > Something I missed about querying a mol for conformers ? As of now > I lose the input conformer and the code will always output a > calculated-from-scratch depiction. > > Cheers > -- Jan > > > ------------------------------------------------------------------------------ > Subversion Kills Productivity. Get off Subversion & Make the Move > to Perforce. > With Perforce, you get hassle-free workflows. Merge that actually > works. > Faster operations. Version large binaries. Built-in WAN > optimization and the > freedom to use Git, Perforce or both. Make the move to Perforce. > http://pubads.g.doubleclick.net/gampad/clk?id=122218951&iu=/4140/ostg.clktrk > _______________________________________________ > Rdkit-discuss mailing list > Rdk...@li... > <mailto:Rdk...@li...> > https://lists.sourceforge.net/lists/listinfo/rdkit-discuss > > |