Re: [Rdkit-discuss] Kekulization step in RDKit
From: Greg L. <gre...@gm...> - 2011-05-25 04:23:04
Dear Vlad,

On Tue, May 24, 2011 at 4:47 PM, Vlad Joseph Sykora <js...@mo...> wrote:
> Hi Greg, thanks for the suggestion. Though the following
> m = Chem.MolFromSmiles('c1c2ccnc2ccc1',False)
> Draw.MolToFile(m,'temp.png',kekulize=False)
>
> Still throws as:
>
> ****
> Pre-condition Violation
> getNumImplicitHs() called without preceding call to calcImplicitValence()
> Violation occurred on line 162 in file GraphMol/Atom.cpp
> Failed Expression: d_implicitValence>-1
> ****
>
> The problem is that the CDL SMILES writer doesn't print explicit Hydrogens
> in aromatic structures for saturated heteroatoms. This is because the CDL
> Kekulization algorithm tries all possibilities of double-bond arrangements
> in ring systems, including saturation of heteroatoms (one of the aromaticity
> rules in CDL is that lone pairs of sp3 heteroatoms in ring systems
> contribute to 2 delocalized pi electrons). So "c1c2ccnc2ccc1" is equivalent
> to "c1c2cc[nH]c2ccc1" (which is accepted by RDKit) because saturating the
> Nitrogen in the 5-member ring is the only way the ring system can be
> aromatic (as defined by Hueckel's rule).

As an aside (I answer your question below): your approach isn't consistent
with most SMILES writers/readers and places a high, easily avoidable burden
on the parsing code. Your code knows that the H has to be there, so why not
just write it out?
For what it's worth, Daylight's code doesn't accept c1cccn1 either; try
entering it here: http://www.daylight.com/daycgi/depict/

>
> Is there any way RDKit can force the Kekulization (or try the saturation of
> Heteroatoms in ring systems)?

You have a couple of choices.

1) Draw the molecule with aromatic bonds (the svn version of the code draws
these as dashed bonds):
>>> from rdkit import Chem
>>> m = Chem.MolFromSmiles('c1c2ccnc2ccc1',False)
>>> m.UpdatePropertyCache()
>>> from rdkit.Chem import Draw
>>> Draw.ShowMol(m,kekulize=False)

2) Try to automatically add the explicit Hs.
A script with some Python code for doing this was posted last summer
(http://www.mail-archive.com/rdk...@li.../msg01185.html).
Using that code you can do the following:
>>> from rdkit import Chem
>>> from rdkit.Chem import Draw
>>> import sanifix3
>>> m = Chem.MolFromSmiles('c1c2ccnc2ccc1',False)
>>> nm=sanifix3.AdjustAromaticNs(m)
[06:15:40] Can't kekulize mol
>>> Draw.ShowMol(nm)

I'm not a big fan of automatically repairing molecules, but the approach
used in sanifix3 is reasonably robust.

Best,
-greg
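
One way to see why 'c1c2ccnc2ccc1' cannot be kekulized as written, while 'c1c2cc[nH]c2ccc1' can, is to treat kekulization as a pairing problem: every aromatic atom that does not carry the H must end up in exactly one double bond. The plain-Python sketch below illustrates that argument only. It is not the algorithm used by RDKit, CDL, or sanifix3; the function name `can_kekulize` and the hard-coded bond list (atoms numbered in SMILES order, atom 4 being the nitrogen) are assumptions made for this example.

```python
def can_kekulize(bonds, atoms_needing_double_bond):
    """Return True if every listed atom can be given exactly one double bond.

    This is a brute-force search for a perfect matching, restricted to
    bonds whose two ends both still need a double bond.
    """
    wanted = set(atoms_needing_double_bond)
    neighbors = {atom: set() for atom in wanted}
    for a, b in bonds:
        if a in wanted and b in wanted:
            neighbors[a].add(b)
            neighbors[b].add(a)

    def match(remaining):
        if not remaining:
            return True  # every atom has been paired
        atom = min(remaining)  # pick any unpaired atom
        for nbr in neighbors[atom]:
            # try placing a double bond between atom and nbr
            if nbr in remaining and match(remaining - {atom, nbr}):
                return True
        return False  # atom cannot be paired with any free neighbor

    return match(frozenset(wanted))


# Ring skeleton of 'c1c2ccnc2ccc1': atoms 0..8 in SMILES order, atom 4 is N.
INDOLE_BONDS = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 1),
                (5, 6), (6, 7), (7, 8), (8, 0)]

# Written as 'n': all nine atoms need a double bond, which is impossible.
print(can_kekulize(INDOLE_BONDS, range(9)))                    # False

# Written as '[nH]': the nitrogen drops out, leaving eight atoms to pair.
print(can_kekulize(INDOLE_BONDS, [0, 1, 2, 3, 5, 6, 7, 8]))    # True
```

With nine atoms to pair, one atom is always left over, which is the situation behind the "Can't kekulize mol" message; marking the nitrogen as [nH] removes it from the pairing and leaves an even set that can be matched.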