#1294 Aromaticity problem in CDKHueckelAromaticityDetector

Milestone: cdk-1.4.x
Status: closed
Owner: nobody
Labels: None
Priority: 1
Updated: 2013-12-18
Created: 2013-03-20
Creator: Patrik Rydberg
Private: No

The atoms in the five-membered ring of isoindole are not recognized as aromatic, and its bonds are not recognized as aromatic bonds, when the molecule is read from the SMILES below. (Tested with CDK 1.4.8)
c2c1ccccc1cn2
A molfile of the same molecule with explicit double and single bonds is perceived correctly,
as is a SMILES with explicit bonds:
C2=C1C=CC=CC1=CN2

Discussion

  • John May
    2013-03-25

    I'm trying to find the example… but I had this exact case where one structure was from ChEBI and one from HMDB. They refer to the same metabolite but differ in double bond placement. The point is that, rather than encoding the specific hybridisation, you can encode the geometry instead. As the AtomType.Hybridization doc states, they both have trigonal planar geometry, so a less specific fingerprint or search can match them. By InChI standards they are the same, but if you want to tell this in the CDK without running aromaticity or tautomerisation algorithms, you can do so using the geometry and cut down a lot of search space.
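The geometry argument above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the `Hybridization` and `Geometry` enums below are hypothetical local stand-ins, not CDK's own types, and the grouping of sp2 and planar3 under one trigonal planar class is taken from the post itself.

```java
/** Illustrative sketch only; the enums are hypothetical stand-ins, not CDK types. */
final class GeometrySketch {

    /** Assumed subset of hybridisation states relevant to the discussion. */
    enum Hybridization { SP1, SP2, SP3, PLANAR3 }

    /** Coarser class: several hybridisations can share one geometry. */
    enum Geometry { LINEAR, TRIGONAL_PLANAR, TETRAHEDRAL }

    /** Reduce a hybridisation to its geometry class. */
    static Geometry geometryOf(Hybridization h) {
        switch (h) {
            case SP1:
                return Geometry.LINEAR;
            case SP2:
            case PLANAR3:
                return Geometry.TRIGONAL_PLANAR; // both are trigonal planar
            case SP3:
                return Geometry.TETRAHEDRAL;
            default:
                throw new IllegalArgumentException("unhandled hybridisation: " + h);
        }
    }

    /** Less specific comparison: true when two atoms share a geometry class. */
    static boolean sameGeometry(Hybridization a, Hybridization b) {
        return geometryOf(a) == geometryOf(b);
    }

    public static void main(String[] args) {
        // Strict comparison tells the two nitrogen types apart ...
        System.out.println(Hybridization.SP2 == Hybridization.PLANAR3);
        // ... while the geometry-based comparison treats them as equivalent.
        System.out.println(sameGeometry(Hybridization.SP2, Hybridization.PLANAR3));
    }
}
```

Comparing on geometry deliberately discards information: two atoms that differ electronically can still compare equal.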

    Anyways, I think we've gone off bug topic :-).
    J

    On 25 Mar 2013, at 11:10, "Egon Willighagen" egonw@users.sf.net wrote:

    According to this presentation, around one bond length:

    http://www.princeton.edu/chemistry/macmillan/group-meetings/DEC_tunneling.pdf


  • On Mon, Mar 25, 2013 at 12:48 PM, John May jwmay@users.sf.net wrote:

    > I'm trying to find the example… but I had this exact case where one structure was from
    > ChEBI and one from HMDB. They refer to the same metabolite but differ in
    > double bond placement.

    That sounds like pure electron delocalization, but that is something
    other than moving hydrogens...

    > The point is that, rather than
    > encoding the specific hybridisation, you can encode the geometry instead.

    The geometry should be thought of in terms of electron positions...
    N.planar3 and N.sp2 have different electron placements.

    > As the AtomType.Hybridization doc states, they both have trigonal planar geometry,
    > so a less specific fingerprint or search can match them.

    No, because a fingerprint normally takes into account the full structure.

    The same geometry does not mean the same thing.

    > By InChI standards they are the same

    Only because it tries to accommodate tautomerism, since many
    people are indeed OK with finding tautomers. Also note that the InChI tautomer
    rules are not very "good".

    > - but if you want to tell this in the CDK
    > without running aromaticity or tautomerisation algorithms you can do this using
    > the geometry and cut down a lot of search space.

    So, you would rather match N.sp2 to N.planar3 than N.sp3 to N.planar3? The
    latter two are actually way more similar (electronically, chemically,
    ...)!

    > Anyways, I think we've gone off bug topic :-).

    Well, nitrogens just are this complex :)

    Egon

    --
    Dr E.L. Willighagen
    Postdoctoral Researcher
    Department of Bioinformatics - BiGCaT
    Maastricht University (http://www.bigcat.unimaas.nl/)
    Homepage: http://egonw.github.com/
    LinkedIn: http://se.linkedin.com/in/egonw
    Blog: http://chem-bla-ics.blogspot.com/
    PubList: http://www.citeulike.org/user/egonw/tag/papers

     
  • Patrik Rydberg
    2013-03-26

    Well, while we're slightly off topic: what's the correct SMARTS to use for such a ring when the hydrogen count on the nitrogen atom could be anything? Is it c2c1ccccc1c[n*]2?

     
  • John May
    2013-04-12

    Not sure, I'm afraid :/. Possibly
    c2c1ccccc1c[n;H1]2

    This expression is for an H-pyrrole nitrogen - SMARTS Theory Manual

     
    Last edit: John May 2013-04-12
  • John May
    2013-12-18

    • status: open --> closed
     
  • John May
    2013-12-18

    Fixed: in CDK versions later than 1.5, the SMILES parser throws an exception for molecules that cannot be kekulised.
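Why the reported SMILES cannot be kekulised can be shown with a small, self-contained Java sketch. It is not CDK's implementation: the bond table is written out by hand from c2c1ccccc1cn2, and it assumes the simplified rule that every aromatic atom written without an explicit hydrogen must take part in exactly one double bond, so kekulisation becomes a search for a perfect matching. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only: hand-built graph and simplified rules, not CDK code. */
final class KekuleSketch {

    /** Bonds of c2c1ccccc1cn2, atoms numbered 0..8 in SMILES order (atom 8 is the nitrogen). */
    static final int[][] ISOINDOLE_BONDS = {
        {0, 1}, {1, 2}, {2, 3}, {3, 4}, {4, 5}, {5, 6}, {6, 7}, {7, 8}, // chain bonds
        {6, 1}, {8, 0}                                                  // ring closures 1 and 2
    };

    /** Mask marking every atom as needing a double bond, except the listed ones. */
    static boolean[] allBut(int atomCount, int... excluded) {
        boolean[] mask = new boolean[atomCount];
        java.util.Arrays.fill(mask, true);
        for (int e : excluded) mask[e] = false;
        return mask;
    }

    /** True if every atom flagged in needsDouble can be paired with a flagged neighbour. */
    static boolean isKekulisable(int atomCount, int[][] bonds, boolean[] needsDouble) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < atomCount; i++) adj.add(new ArrayList<>());
        for (int[] b : bonds) {
            adj.get(b[0]).add(b[1]);
            adj.get(b[1]).add(b[0]);
        }
        return match(adj, needsDouble, new boolean[atomCount]);
    }

    /** Backtracking search for a perfect matching over the flagged atoms. */
    private static boolean match(List<List<Integer>> adj, boolean[] needsDouble, boolean[] matched) {
        int atom = -1;
        for (int i = 0; i < matched.length; i++) {
            if (needsDouble[i] && !matched[i]) { atom = i; break; }
        }
        if (atom < 0) return true; // every flagged atom has a double bond
        matched[atom] = true;
        for (int nbr : adj.get(atom)) {
            if (needsDouble[nbr] && !matched[nbr]) {
                matched[nbr] = true;
                if (match(adj, needsDouble, matched)) return true;
                matched[nbr] = false; // undo and try the next neighbour
            }
        }
        matched[atom] = false;
        return false;
    }

    public static void main(String[] args) {
        // Nitrogen written as plain "n": all nine atoms need a double bond.
        System.out.println(isKekulisable(9, ISOINDOLE_BONDS, allBut(9)));    // false
        // Nitrogen written as "[nH]": it is left out, eight carbons remain.
        System.out.println(isKekulisable(9, ISOINDOLE_BONDS, allBut(9, 8))); // true
    }
}
```

With a plain `n`, nine atoms each need a double bond and an odd number of atoms cannot be paired, so no Kekulé structure exists. Excluding the nitrogen, as an `[nH]` would, leaves the pairing 0=1, 2=3, 4=5, 6=7, which is the bond pattern of C2=C1C=CC=CC1=CN2 from the report.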

     