#1195 some descriptors fail to handle aromaticity correctly

open
nobody
9
2012-10-08
2011-12-21
No

A number of descriptors do not handle the two representations of aromaticity consistently: for example, c1ccccc1 and C1=CC=CC=C1 give quite different values for the same descriptor.
Three examples are listed below, each with the full list of descriptors that fail:

Comparing descriptor values for c1ccccc1 and C1=CC=CC=C1
XLogP 4.062 2.082
C2SP2 0.0 6.0
C2SP3 6.0 0.0
ALogP 0.0 1.4033999999999995
ALogp2 0.0 1.9695315599999987
AMR 0.0 30.955799999999996
BCUTc-1l -0.18066486607184457 -0.21175815206924314
BCUTc-1h 0.11938953556395132 0.08829624956655291
MolIP 0.0 8.7360121

Comparing descriptor values for c1ccccc1CCC(CC)CCN and C1=CC=CC=C1CCC(CC)CCN
LipinskiFailures 1.0 0.0
XLogP 5.906 3.8899999999999997
ATSc1 0.02963300936257801 0.030880312144664857
ATSc2 -0.012090061482428298 -0.012341625539837791
ATSc3 -0.0025292327510609384 -0.0026466176367462594
ATSc4 -1.8482519638761126E-4 7.312021157925057E-4
ATSc5 -1.2229491464390043E-5 0.001777028789656713
C2SP2 0.0 5.0
C3SP2 0.0 1.0
C2SP3 9.0 4.0
C3SP3 2.0 1.0
ALogP -1.2510999999999997 -0.04330000000000056
ALogp2 1.5652512099999991 0.0018748900000000485
AMR 32.7762 62.511900000000004
BCUTc-1h 0.027581097463993814 0.027504421178008526
MolIP 9.78147624160733 8.199589395397101

Comparing descriptor values for c1ccccc1CCC(CC)CCNc2ccccc2 and C1=CC=CC=C1CCC(CC)CCNC2=CC=CC=C2
XLogP 8.838999999999999 5.454000000000001
ATSc1 0.062282611216200066 0.05057222131722536
ATSc2 -0.03115211769438775 -0.025073172136074226
ATSc3 -0.004660620187309899 -0.003476970136158749
ATSc4 0.003572779513132754 0.003487608390774221
ATSc5 9.70815694659988E-4 0.0018642068852909237
C2SP2 0.0 11.0
C3SP2 0.0 1.0
C2SP3 15.0 4.0
C3SP3 2.0 1.0
ALogP -0.9215999999999995 1.2428000000000012
ALogp2 0.8493465599999991 1.5445518400000031
AMR 31.8541 91.87329999999999
BCUTc-1l -0.30846158469250606 -0.2884631663210181
BCUTc-1h 0.16905183889116193 0.13855538667368522
MolIP 8.496508028720328 5.114511566145805
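
A minimal sketch of how such a side-by-side comparison could be filtered, assuming the descriptor values have already been computed for both SMILES forms. The class and method names (DescriptorDiff, differing, exampleValues) and the tolerance are illustrative only and are not part of CDK. The three value pairs are copied from the second comparison above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DescriptorDiff {

    /**
     * Returns the names of descriptors whose two values differ by more than tol.
     * Each map value holds {aromatic-form value, Kekule-form value}.
     */
    public static List<String> differing(Map<String, double[]> values, double tol) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, double[]> e : values.entrySet()) {
            double aromatic = e.getValue()[0];
            double kekule = e.getValue()[1];
            if (Math.abs(aromatic - kekule) > tol) {
                names.add(e.getKey());
            }
        }
        return names;
    }

    /** A few value pairs from the c1ccccc1CCC(CC)CCN comparison in this report. */
    public static Map<String, double[]> exampleValues() {
        Map<String, double[]> m = new LinkedHashMap<>();
        m.put("XLogP", new double[] {5.906, 3.8899999999999997});
        m.put("C2SP2", new double[] {0.0, 5.0});
        m.put("BCUTc-1h", new double[] {0.027581097463993814, 0.027504421178008526});
        return m;
    }

    public static void main(String[] args) {
        Map<String, double[]> values = exampleValues();
        // The tolerance is an arbitrary assumption; pick one that suits the descriptor scale.
        for (String name : differing(values, 1e-6)) {
            double[] v = values.get(name);
            System.out.println(name + " " + v[0] + " " + v[1]);
        }
    }
}
```

The tolerance decides what counts as a failure: a very small value flags every pair that is not numerically identical, while a looser one keeps only the large discrepancies such as XLogP and the hybridisation counts.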

This is the structure pre-processing that was used:

void processStructure(IAtomContainer ac) throws Exception
{
    clearAromaticityFlags(ac);
    AtomContainerManipulator.percieveAtomTypesAndConfigureAtoms(ac);
    CDKHydrogenAdder adder = CDKHydrogenAdder.getInstance(SilentChemObjectBuilder.getInstance());
    adder.addImplicitHydrogens(ac);
    //AtomContainerManipulator.convertImplicitToExplicitHydrogens(ac);

    CDKHueckelAromaticityDetector.detectAromaticity(ac);
}

Discussion

  • Nikolay Kochev - 2011-12-21

    In my opinion this bug is serious, since these descriptors are used for very important tasks. For example, XLogP gives the better value (2.082) for the Kekulé representation C1=CC=CC=C1 and a clearly erroneous value (4.062) for the aromatic representation c1ccccc1 of benzene. The experimental LogP value for benzene is 2.13.
    This is a big problem because most people prefer the aromatic form to the Kekulé one. The same problem appears in many descriptors.

     
  • Egon Willighagen

    Agreed. This is important. If that preprocessing has taken place, these descriptors should give the same values, no matter what SMILES input was given...

    Since the various descriptors have different authors, I suggest splitting this bug report into separate reports for the various descriptor classes...