Re: [Rdkit-discuss] Kekulization step in RDKit
From: Greg L. <gre...@gm...> - 2011-05-25 04:23:04
Dear Vlad,
On Tue, May 24, 2011 at 4:47 PM, Vlad Joseph Sykora <js...@mo...> wrote:
> Hi Greg, thanks for the suggestion. Though the following
> m = Chem.MolFromSmiles('c1c2ccnc2ccc1',False)
> Draw.MolToFile(m,'temp.png',kekulize=False)
>
> Still throws as:
>
> ****
> Pre-condition Violation
> getNumImplicitHs() called without preceding call to calcImplicitValence()
> Violation occurred on line 162 in file GraphMol/Atom.cpp
> Failed Expression: d_implicitValence>-1
> ****
>
> The problem is that the CDL SMILES writer doesn't print explicit Hydrogens
> in aromatic structures for saturated heteroatoms. This is because the CDL
> Kekulization algorithm tries all possibilities of double-bond arrangements
> in ring systems, including saturation of heteroatoms (one of the aromaticity
> rules in CDL is that lone pairs of sp3 heteroatoms in ring systems
> contribute to 2 delocalized pi electrons). So "c1c2ccnc2ccc1" is equivalent
> to "c1c2cc[nH]c2ccc1" (which is accepted by RDKit) because saturating the
> Nitrogen in the 5-member ring is the only way the ring system can be
> aromatic (as defined by Hueckel's rule).
As an aside (I answer your question below): Your approach isn't
consistent with most SMILES writers/readers and places a high, easily
avoidable burden on the parsing code. Your code knows that the H
has to be there, so why not just write it out?
For what it's worth: Daylight's code doesn't accept c1cccn1 either,
try entering it here:
http://www.daylight.com/daycgi/depict/
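To see the difference from the RDKit side, a minimal check along these lines can help. It is only a sketch: the helper name `parses` is made up for illustration, and it relies on `Chem.MolFromSmiles` returning `None` when default sanitization (including kekulization) fails.

```python
from rdkit import Chem

def parses(smiles):
    """Return True if RDKit can parse and sanitize the SMILES (hypothetical helper)."""
    # With default settings MolFromSmiles sanitizes the molecule and
    # returns None if any sanitization step, e.g. kekulization, fails.
    return Chem.MolFromSmiles(smiles) is not None

# The SMILES without the H on the aromatic N, and the one with it written out.
results = {smi: parses(smi) for smi in ('c1c2ccnc2ccc1', 'c1c2cc[nH]c2ccc1')}
print(results)
```

The first SMILES is the one from the CDL writer; the second carries the explicit `[nH]`.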
>
> Is there any way RDKit can force the Kekulization (or try the saturation of
> Heteroatoms in ring systems)?
You have a couple of choices.
1) Draw the molecule with aromatic bonds (uses dashed bonds in the svn
version of the code):
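>>> from rdkit import Chem
>>> from rdkit.Chem import Draw
>>> # sanitize=False skips kekulization and the other sanitization steps
>>> m = Chem.MolFromSmiles('c1c2ccnc2ccc1',sanitize=False)
>>> # compute implicit valences so that getNumImplicitHs() can be called
>>> m.UpdatePropertyCache()
>>> Draw.ShowMol(m,kekulize=False)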
2) Try to automatically add the explicit Hs. A script with
some Python code for doing this was posted last summer
(http://www.mail-archive.com/rdk...@li.../msg01185.html).
Using that code you can do the following:
>>> from rdkit import Chem
>>> from rdkit.Chem import Draw
>>> import sanifix3
>>> m = Chem.MolFromSmiles('c1c2ccnc2ccc1',False)
>>> nm=sanifix3.AdjustAromaticNs(m)
[06:15:40] Can't kekulize mol
>>> Draw.ShowMol(nm)
I'm not a big fan of automatically repairing molecules, but the
approach used in sanifix3 is reasonably robust.
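To make the idea behind option 2 concrete, here is a much simplified sketch. It is not the sanifix3 algorithm: the function name `add_h_to_one_aromatic_n` is hypothetical, and it only tries putting a single H on one two-coordinate aromatic N at a time, so it will not repair molecules that need more than one added H.

```python
from rdkit import Chem

def add_h_to_one_aromatic_n(smiles):
    """Return a sanitized mol, adding one H to an aromatic N if needed.

    Hypothetical, simplified helper: returns None if no single
    addition makes the molecule sanitizable.
    """
    m = Chem.MolFromSmiles(smiles, sanitize=False)
    if m is None:
        return None
    # First try the molecule unchanged (None), then each candidate N in turn.
    candidates = [None] + [
        a.GetIdx() for a in m.GetAtoms()
        if a.GetAtomicNum() == 7 and a.GetIsAromatic()
        and a.GetDegree() == 2 and a.GetNumExplicitHs() == 0
    ]
    for idx in candidates:
        trial = Chem.Mol(m)  # work on a copy so failed attempts are discarded
        if idx is not None:
            atom = trial.GetAtomWithIdx(idx)
            atom.SetNumExplicitHs(1)   # same effect as writing [nH]
            atom.SetNoImplicit(True)
        try:
            Chem.SanitizeMol(trial)
        except Exception:
            continue  # this choice does not kekulize, try the next one
        return trial
    return None

fixed = add_h_to_one_aromatic_n('c1c2ccnc2ccc1')
print(Chem.MolToSmiles(fixed))
```

Trying the unmodified molecule first keeps the helper from touching input that is already fine, which limits how much "automatic repair" actually happens.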
Best,
-greg