#1177 smiles generator removes aromaticity from aromatic fragments

cdk-1.4.x
open
nobody
5
2013-06-03
2011-09-13
Martin Gütlein
No
3 comments

See Mailing List Thread:

http://www.mail-archive.com/cdk-user@lists.sourceforge.net/msg02210.html

Email with problem description and example code:

Hi all,

I would like to mine the MCS and print it as aromatic SMARTS. Unfortunately, the aromaticity information gets lost, even though the MCS fragment has aromaticity flags assigned (see code example below).
IMHO, the problem is that the SMILES writer (is there a SMARTS writer?) has its own aromaticity detection.

Thanks for helping, regards,
Martin

Example:

output
[[
is mcs atom arom: true
is mcs atom arom: true
is mcs atom arom: true
is mcs atom arom: true
is mcs atom arom: true
mcs smiles: CCCCC
]]

code
[[
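import java.util.ArrayList;
import java.util.List;

import org.openscience.cdk.CDKConstants;
import org.openscience.cdk.DefaultChemObjectBuilder;
import org.openscience.cdk.interfaces.IAtom;
import org.openscience.cdk.interfaces.IAtomContainer;
import org.openscience.cdk.interfaces.IMolecule;
import org.openscience.cdk.smiles.SmilesGenerator;
import org.openscience.cdk.smiles.SmilesParser;

// parse the two input molecules (aromatic SMILES, so the atoms carry aromaticity flags)
SmilesParser sp = new SmilesParser(DefaultChemObjectBuilder.getInstance());
IAtomContainer mol1 = sp.parseSmiles("c1ccccc1NC");
IAtomContainer mol2 = sp.parseSmiles("c1cccnc1");

// compute the MCS of the two molecules
org.openscience.cdk.smsd.Isomorphism mcs = new org.openscience.cdk.smsd.Isomorphism(
        org.openscience.cdk.smsd.interfaces.Algorithm.DEFAULT, true);
mcs.init(mol1, mol2, true, true);
mcs.setChemFilters(true, true, true);

// copy the first molecule and collect every atom that is not part of the mapping
mol1 = mcs.getReactantMolecule();
IMolecule mcsmolecule = DefaultChemObjectBuilder.getInstance().newInstance(IMolecule.class, mol1);
List<IAtom> atomsToBeRemoved = new ArrayList<IAtom>();
for (IAtom atom : mcsmolecule.atoms())
{
    int index = mcsmolecule.getAtomNumber(atom);
    if (!mcs.getFirstMapping().containsKey(index))
        atomsToBeRemoved.add(atom);
}

// strip the unmapped atoms so that only the MCS fragment remains
for (IAtom atom : atomsToBeRemoved)
    mcsmolecule.removeAtomAndConnectedElectronContainers(atom);

// the remaining atoms still have their aromaticity flags set
for (int i = 0; i < mcsmolecule.getAtomCount(); i++)
    System.out.println("is mcs atom arom: " + mcsmolecule.getAtom(i).getFlag(CDKConstants.ISAROMATIC));

// ... but the generated SMILES is written without aromaticity
SmilesGenerator g = new SmilesGenerator();
g.setUseAromaticityFlag(true);
System.out.println("mcs smiles: " + g.createSMILES(mcsmolecule));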

]]

Discussion

  • Just noticed that my ticket reports get auto-assigned to Chris Steinbeck; changing that to none.

     
  • I'll check in my hack to fix this; the proper fix should be a bit more elegant, though.

     
  • John May
    2013-06-03

    Okay, I've patched this here: /patches/643.

    I think your solution is correct; however, the MCS output may not be valid SMILES. This is bad, as a robot could pick it up and try to deposit it in PubChem, for example. However, IMO the SMILES generator should not change the input.

    In your case, an alternative is to store either molecule in full and then keep flags for which atoms are in the MCS. You'd actually probably want to store the bonds, but as an example:

    c1cccnc1 {0,1,2,3,5}
    

    J
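
    The "full molecule plus MCS flags" idea from the comment above could be sketched roughly as follows. This is only an illustration in plain Java with no CDK dependency; the class name McsSelection and its methods are hypothetical and not part of CDK or of the patch. It keeps the complete, valid SMILES untouched and records the indices of the atoms that belong to the MCS alongside it.

```java
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.stream.Collectors;

/**
 * Hypothetical holder (not a CDK class): a complete SMILES string plus the
 * indices of the atoms that are part of the MCS.
 */
final class McsSelection {
    private final String smiles;
    private final SortedSet<Integer> mcsAtoms = new TreeSet<>();

    McsSelection(String smiles, int... atomIndices) {
        this.smiles = smiles;
        for (int index : atomIndices) {
            mcsAtoms.add(index);
        }
    }

    /** The full molecule, unchanged, so it stays valid SMILES. */
    String smiles() {
        return smiles;
    }

    /** True if the atom at this index (in SMILES input order) is in the MCS. */
    boolean inMcs(int atomIndex) {
        return mcsAtoms.contains(atomIndex);
    }

    /** Renders the record in the "SMILES {i,j,...}" form used in the comment. */
    @Override
    public String toString() {
        String indices = mcsAtoms.stream()
                .map(String::valueOf)
                .collect(Collectors.joining(",", "{", "}"));
        return smiles + " " + indices;
    }
}
```

    With the example from the comment, new McsSelection("c1cccnc1", 0, 1, 2, 3, 5) keeps pyridine as written and marks the five carbons; atom 4, the nitrogen, is left outside the MCS. As noted above, a real implementation would probably track the bonds too.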