#1300 SmilesGenerator: charged, aromatic planar3 nitrogen has spurious brackets

Milestone: cdk-1.4.x
Status: closed
Owner: nobody
Labels: None
Priority: 1
Updated: 2013-06-03
Created: 2013-05-30
Creator: Joos Kiener
Private: No

SmilesGenerator has a bug when the molecule contains a charged, aromatic planar3-nitrogen and useAromaticity is set to true.

Change line 1663 (approximately) from

if (a.getSymbol().equals("N") && a.getHybridization() == IAtomType.Hybridization.PLANAR3 && container.getConnectedAtomsList(a).size() != 3) {

To:

if (a.getSymbol().equals("N") && a.getHybridization() == IAtomType.Hybridization.PLANAR3 && container.getConnectedAtomsList(a).size() != 3 && a.getFormalCharge() == 0) {

i.e., add an additional check for the atom's formal charge:

a.getFormalCharge() == 0

Molecule ZINC58167940 can be used for testing (see attachment).

1 Attachments

Discussion

  • Egon Willighagen

    • summary: SmilesGenerator: charged, aromatic planar3 nitrogen gets explicit H --> SmilesGenerator: charged, aromatic planar3 nitrogen has spurious brackets
     
  • Egon Willighagen

    To be clear, the explicit hydrogen on the lowercase nitrogen is a must; that's not the bug.

    However, the generator currently does not integrate this correctly with the charge information, which causes the double []...

    So, rather than outputting:

    [[nH]-]

    ... it should output instead:

    [nH-]
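The difference between the two outputs above comes down to writing the hydrogen count and the charge inside one pair of brackets rather than wrapping an already bracketed atom a second time. The following is a minimal, self-contained sketch of that idea. It is not CDK's SmilesGenerator code; the class `BracketAtomSketch` and the method `bracketAtom` are hypothetical names used only for illustration.

```java
// Hypothetical illustration; not part of CDK.
class BracketAtomSketch {

    // Builds one SMILES bracket atom: symbol, then hydrogen count, then charge,
    // all inside a single pair of square brackets.
    static String bracketAtom(String symbol, int hCount, int charge) {
        StringBuilder sb = new StringBuilder("[").append(symbol);
        if (hCount == 1) {
            sb.append('H');                 // a single hydrogen is written without a count
        } else if (hCount > 1) {
            sb.append('H').append(hCount);
        }
        if (charge != 0) {
            sb.append(charge > 0 ? '+' : '-');
            int magnitude = Math.abs(charge);
            if (magnitude > 1) {
                sb.append(magnitude);       // e.g. "+2"; a unit charge needs no digit
            }
        }
        return sb.append(']').toString();
    }

    public static void main(String[] args) {
        System.out.println(bracketAtom("n", 1, -1)); // aromatic N with one H and charge -1
        System.out.println(bracketAtom("n", 0, -1)); // aromatic N with no H and charge -1
        System.out.println(bracketAtom("n", 1, 0));  // neutral aromatic N with one H
    }
}
```

Collecting both properties before emitting the brackets avoids the nesting entirely, because there is never an intermediate string such as `[nH]` that a later step could wrap again.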

     
  • Joos Kiener

    Joos Kiener - 2013-05-30

    First off, I'm not a chemist, so what I say might be wrong.

    ZINC lists Cc1cc(c(n1c2[n-]ncn2)C)c3csc(n3)CCCOC as the SMILES for this molecule, and that SMILES can be converted back to the same image as displayed in ZINC, for example with the identifier resolver.

    In the molfile the nitrogen has no bond to any hydrogen, and the hydrogens are given explicitly in the file. So I think the H does not belong there and [n-] is correct for this specific molecule (or [nH] with no charge)?

    EDIT: It is charged in ZINC due to the reference pH used

    http://zinc.docking.org/substance/58167940

    so [n-] seems to be correct for this specific case, or [nH] uncharged at a different pH.

     
    Last edit: Joos Kiener 2013-05-30
    • Egon Willighagen

      You're right. There were two issues: one was the double brackets, but it should not have a hydrogen anyway...

       
  • Egon Willighagen

    I created a patch for the two issues: https://sourceforge.net/p/cdk/patches/641/

    The first two patches fix the double bracket problem, and the second is a variation of the patch suggested by Joos. Besides testing whether the charge is 0, we should also test whether it is unset (null), in which case we can assume it is zero.
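The null case matters in Java because comparing a null `Integer` with `== 0` unboxes it and throws a NullPointerException. Below is a minimal sketch of a null-tolerant version of the condition from the original report. It is not the code of patch 641; the class `ChargeCheckSketch`, its methods, and the plain parameters standing in for CDK's atom and container objects are all assumptions made for illustration.

```java
// Hypothetical illustration; not the actual patch and not part of CDK.
class ChargeCheckSketch {

    // Treats an unset (null) formal charge as zero, as suggested in the comment above.
    static boolean isNeutral(Integer formalCharge) {
        return formalCharge == null || formalCharge == 0;
    }

    // Mirrors the shape of the condition in the report, with stand-in parameters:
    //   symbol       ~ a.getSymbol()
    //   planar3      ~ a.getHybridization() == IAtomType.Hybridization.PLANAR3
    //   neighbours   ~ container.getConnectedAtomsList(a).size()
    //   formalCharge ~ a.getFormalCharge(), which may be null
    static boolean matchesPatchedCondition(String symbol, boolean planar3,
                                           int neighbours, Integer formalCharge) {
        return "N".equals(symbol)
                && planar3
                && neighbours != 3
                && isNeutral(formalCharge);
    }

    public static void main(String[] args) {
        // Charged nitrogen: the branch is skipped.
        System.out.println(matchesPatchedCondition("N", true, 2, -1));
        // Unset charge: treated as neutral, so the branch is taken.
        System.out.println(matchesPatchedCondition("N", true, 2, null));
    }
}
```

Putting the null test first relies on short-circuit evaluation, so the unboxing comparison only runs when a value is actually present.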

     
  • John May

    John May - 2013-06-03

    Fixed by patch 641

     
  • John May

    John May - 2013-06-03
    • status: open --> closed
     
