#1309 MDLV2000Writer and degenerate bond types

Milestone: cdk-1.4.x
Status: open
Owner: nobody
Labels: None
Priority: 1
Updated: 2014-08-17
Created: 2013-08-05
Creator: Duece99
Private: No

Hi,

I've noticed in this document:

http://infochim.u-strasbg.fr/recherche/Download/Fragmentor/MDL_SDF.pdf

that the MDLV2000 format is actually supposed to support degenerate bond types, like so (as quoted from page 48):

1 = Single
2 = Double
3 = Triple
4 = Aromatic
5 = Single or Double
6 = Single or Aromatic
7 = Double or Aromatic
8 = Any
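
For reference, the eight codes quoted above could be captured in a small lookup like the one below. This is only a sketch: the `MdlBondType` enum, its `fromCode` method and the demo class are made-up names for illustration, not CDK classes.

```java
// Hypothetical stand-in for the eight V2000 bond type codes quoted above.
// This enum is NOT part of CDK.
enum MdlBondType {
    SINGLE(1, "Single"),
    DOUBLE(2, "Double"),
    TRIPLE(3, "Triple"),
    AROMATIC(4, "Aromatic"),
    SINGLE_OR_DOUBLE(5, "Single or Double"),
    SINGLE_OR_AROMATIC(6, "Single or Aromatic"),
    DOUBLE_OR_AROMATIC(7, "Double or Aromatic"),
    ANY(8, "Any");

    final int code;
    final String label;

    MdlBondType(int code, String label) {
        this.code = code;
        this.label = label;
    }

    // Look up a type by its numeric code; anything outside 1..8 is rejected.
    static MdlBondType fromCode(int code) {
        for (MdlBondType t : values()) {
            if (t.code == code) {
                return t;
            }
        }
        throw new IllegalArgumentException("Unknown MDL bond type: " + code);
    }
}

class MdlBondTypeDemo {
    public static void main(String[] args) {
        // Print the table in the same "code = label" form as the quoted list.
        for (int code = 1; code <= 8; code++) {
            System.out.println(code + " = " + MdlBondType.fromCode(code).label);
        }
    }
}
```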

Normally I wouldn't care about using a connection table format for degenerate bonds (I'd just use SMARTS), but for one particular application I actually need to use CTAB-like formats.

In the MDLV2000Writer class, the writeMolecule method seems to support only the first 4 of these:

    int bondType;
    if (writeAromaticBondTypes.isSet() && bond.getFlag(CDKConstants.ISAROMATIC))
        bondType = 4;
    else if (bond.getFlag(CDKConstants.SINGLE_OR_DOUBLE) && bond.getFlag(CDKConstants.ISAROMATIC))
        bondType = 4;
    else if (Order.QUADRUPLE == bond.getOrder())
        throw new CDKException("MDL molfiles do not support quadruple bonds.");
    else
        bondType = bond.getOrder().numeric();
    line += formatMDLInt(bondType, 3);

Yet I could have sworn that the MDLV2000 reader actually supports all 8. Am I missing something?

In any case I think it'd be cool to have the other 4. The document I quoted does mention that the degenerate bond types are only meant to be used in substructure searches, but seeing as CDK supports those anyway, why not?
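
To make the request concrete, here is a rough sketch of how a writer might pick the bond type code if query bonds were in scope. It does not use the CDK API: `SketchBond`, `QueryType`, `bondTypeCode` and `formatMdlInt` are hypothetical stand-ins. `formatMdlInt` assumes the value is written right-justified in a 3-character field, as the `formatMDLInt(bondType,3)` call above suggests, and the `SINGLE_OR_DOUBLE` flag branch of the quoted code is left out for brevity.

```java
// Hypothetical query bond kinds, mirroring codes 5..8 of the quoted table.
enum QueryType { SINGLE_OR_DOUBLE, SINGLE_OR_AROMATIC, DOUBLE_OR_AROMATIC, ANY }

// Minimal stand-in for a bond; NOT the CDK IBond interface.
final class SketchBond {
    final int order;            // 1..4, ignored when queryType is set
    final boolean aromatic;     // stands in for the ISAROMATIC flag
    final QueryType queryType;  // null for an ordinary bond

    SketchBond(int order, boolean aromatic, QueryType queryType) {
        this.order = order;
        this.aromatic = aromatic;
        this.queryType = queryType;
    }
}

final class BondTypeWriterSketch {
    // Choose the V2000 bond type code for one bond.
    static int bondTypeCode(SketchBond bond, boolean writeAromaticBondTypes) {
        if (bond.queryType != null) {
            switch (bond.queryType) {
                case SINGLE_OR_DOUBLE:   return 5;
                case SINGLE_OR_AROMATIC: return 6;
                case DOUBLE_OR_AROMATIC: return 7;
                default:                 return 8; // ANY
            }
        }
        if (writeAromaticBondTypes && bond.aromatic) {
            return 4;
        }
        if (bond.order == 4) {
            throw new IllegalArgumentException("MDL molfiles do not support quadruple bonds.");
        }
        return bond.order;
    }

    // Right-justify an integer in a fixed-width field (assumed behaviour).
    static String formatMdlInt(int value, int width) {
        String digits = Integer.toString(value);
        StringBuilder sb = new StringBuilder();
        for (int i = digits.length(); i < width; i++) {
            sb.append(' ');
        }
        return sb.append(digits).toString();
    }

    public static void main(String[] args) {
        SketchBond query = new SketchBond(1, false, QueryType.SINGLE_OR_AROMATIC);
        System.out.println("[" + formatMdlInt(bondTypeCode(query, false), 3) + "]");
    }
}
```

Checking the query type first keeps the existing aromatic and quadruple handling untouched for ordinary bonds.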

Ed.

Discussion

  • Duece99

    Duece99 - 2013-08-05

    Actually, speaking of the MDLV2000Reader, this is the code that handles bond types in the readAtomContainer method:

    if (order >= 1 && order <= 3) {
        IBond.Order cdkOrder = IBond.Order.SINGLE;
        if (order == 2) cdkOrder = IBond.Order.DOUBLE;
        if (order == 3) cdkOrder = IBond.Order.TRIPLE;
        if (stereo != null) {
            newBond = molecule.getBuilder().newInstance(IBond.class, a1, a2, cdkOrder, stereo);
        } else {
            newBond = molecule.getBuilder().newInstance(IBond.class, a1, a2, cdkOrder);
        }
    } else if (order == 4) {
        // aromatic bond
        if (stereo != null) {
            newBond = molecule.getBuilder().newInstance(IBond.class, a1, a2, IBond.Order.SINGLE, stereo);
        } else {
            newBond = molecule.getBuilder().newInstance(IBond.class, a1, a2, IBond.Order.SINGLE);
        }
        // mark both atoms and the bond as aromatic and raise the SINGLE_OR_DOUBLE-flag
        newBond.setFlag(CDKConstants.SINGLE_OR_DOUBLE, true);
        newBond.setFlag(CDKConstants.ISAROMATIC, true);
        a1.setFlag(CDKConstants.ISAROMATIC, true);
        a2.setFlag(CDKConstants.ISAROMATIC, true);
    } else if (order == 8) {
        continue;
    } else {
        queryBondCount++;
        newBond = new CTFileQueryBond(molecule.getBuilder());
        IAtom[] bondAtoms = {a1, a2};
        newBond.setAtoms(bondAtoms);
        newBond.setOrder(null);
        CTFileQueryBond.Type queryBondType = null;
        switch (order) {
            case 5: queryBondType = CTFileQueryBond.Type.SINGLE_OR_DOUBLE; break;
            case 6: queryBondType = CTFileQueryBond.Type.SINGLE_OR_AROMATIC; break;
            case 7: queryBondType = CTFileQueryBond.Type.DOUBLE_OR_AROMATIC; break;
        }
        ((CTFileQueryBond) newBond).setType(queryBondType);
        newBond.setStereo(stereo);
    }
    

    I see that bond type 8 is skipped entirely (the `continue` branch), when it should actually represent "any bond type".
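
    As a rough illustration of what handling type 8 could look like, here is a sketch of the same branching with 8 mapped to an "any" query bond instead of being skipped. It is deliberately independent of CDK: `ReadBond` and `BondTypeReaderSketch` are hypothetical names, and the `"ANY"` value is an assumption, since no such `CTFileQueryBond.Type` constant appears in the code quoted above.

    ```java
    // Hypothetical result of reading one bond type code; NOT a CDK class.
    final class ReadBond {
        final Integer order;     // 1..3 for plain bonds, 1 for aromatic, null for query bonds
        final boolean aromatic;  // stands in for the ISAROMATIC flag
        final String queryType;  // null unless the code was 5..8

        ReadBond(Integer order, boolean aromatic, String queryType) {
            this.order = order;
            this.aromatic = aromatic;
            this.queryType = queryType;
        }
    }

    final class BondTypeReaderSketch {
        static ReadBond read(int code) {
            if (code >= 1 && code <= 3) {
                return new ReadBond(code, false, null);
            }
            if (code == 4) {
                // As in the quoted reader: stored as a single bond flagged aromatic.
                return new ReadBond(1, true, null);
            }
            switch (code) {
                case 5: return new ReadBond(null, false, "SINGLE_OR_DOUBLE");
                case 6: return new ReadBond(null, false, "SINGLE_OR_AROMATIC");
                case 7: return new ReadBond(null, false, "DOUBLE_OR_AROMATIC");
                case 8: return new ReadBond(null, false, "ANY"); // assumed type name
                default:
                    throw new IllegalArgumentException("Unknown MDL bond type: " + code);
            }
        }

        public static void main(String[] args) {
            ReadBond any = read(8);
            System.out.println("type 8 -> query bond of kind " + any.queryType);
        }
    }
    ```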

    Ed.

     
  • John May

    John May - 2013-08-05

    Hi Ed,

    > Yet I could have sworn that the MDLV2000 reader actually supports all 8. Am I missing something?

    Yes, but CDK doesn't/can't represent them (at least not outside of SMARTS). Bond types 4, 5, 6, 7 and 8 are only for query objects, which aren't in the scope of the current MDL writer. The query bond type '4' is typically used incorrectly for non-query structures, which is why it's supported in this case.
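
    The distinction drawn here can be written down as two small predicates. This is just a sketch of the rule as stated in this comment; `BondTypeScope` and its methods are hypothetical names, not part of CDK.

    ```java
    // Hypothetical helper expressing which V2000 bond type codes are query-only
    // and which ones a plain (non-query) writer accepts, per the comment above.
    final class BondTypeScope {
        // Codes 4..8 are query bond types.
        static boolean isQueryType(int code) {
            return code >= 4 && code <= 8;
        }

        // A plain writer handles 1..3, plus 4 because it is so widely
        // (if incorrectly) used for aromatic bonds in non-query structures.
        static boolean writableAsStructure(int code) {
            return code >= 1 && code <= 4;
        }

        public static void main(String[] args) {
            for (int code = 1; code <= 8; code++) {
                System.out.println(code + ": query=" + isQueryType(code)
                        + ", plain writer=" + writableAsStructure(code));
            }
        }
    }
    ```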

    I have a fix for this similar to SMILES which I'll discuss in future.

    J

    P.S. Official MDL documentation: http://accelrys.com/products/informatics/cheminformatics/ctfile-formats/no-fee.php

  • Duece99

    Duece99 - 2013-08-05

    Hi,

    If you want, I can just code in support for the other 4 bond types for both the reader and the writer. It should be dead easy given the code I've quoted: just use the SMSD/AMBIT SMARTS bond types?

    Ed.

     
