#272 Implementation of Acidic and Basic group count descriptors

Rajarshi Guha
cdk-1.4.x (181)
Egon Willighagen

For feature request #3056330.


  • Rajarshi Guha
    Rajarshi Guha

    Please include sources (if any) for the SMARTS patterns

    In the acidic groups descriptor it's not clear what $(n1nnnc1) and $(n1nncn1) are supposed to match. Could you include an example in the Javadocs (or bring the Javadocs in sync with the code as these are commented out)

    A few more comprehensive (ie bigger target SMILES) tests would be useful

    otherwise looks good

  • Rajarshi, thanx for your review. The SMARTS are derived from JOELib, per the feature request, but will add that information. I also left a question on BOx [0]. Unfortunately, the JOELib JavaDoc is not more informative about the source of those SMARTS. I assume those two ring system are corner cases, and will fix the JavaDoc.

    Regarding the larger molecule, I can add a e.g. this:


    Is that what you have a in mind? A larger, curated test set would of course be ideal, but I am not aware of such.


  • Rajarshi Guha
    Rajarshi Guha

    wrt sources, I see. Well if we can determine the sources that's fine.

    For the test case, yes, I was just thinking of a few bigger molecules - not necessarily a large curated set. Just something more meaty than HCN :)

  • Rajarshi Guha
    Rajarshi Guha

    has the patch been updated?

  • Halfway through the updates, but got a bit distracted with finishing up in Uppsala...

  • Rajarshi Guha
    Rajarshi Guha

    Any udates on this?

  • Rebased on top of cdk-1.4.x and with these fixes:

    • fixed compiling by adding missing @cdk.module annotation for test classes
    • added info in JavaDoc on the source of the SMARTS
    • fixed unit tests to reflect that symmetrical SMARTS can be found twice (though not unique, but the SMARTSTool does not correct for that)
    • added one unit test for a larger molecule, with two different acidic groups

    I think in due time this descriptor can be replaced by a better one, but for now it does implement the feature request.

  • Rajarshi Guha
    Rajarshi Guha

    the test class annotation on AcidicGroupCountDescriptor.java and BasicGroupCountDescriptor is wrong

    is it actually useful to have multiple SQT's rather than loooping over the strings?

    otherwise looks good

  • New patch attached:

    • fixes the annotation
    • adds the missing constructor unit tests
    • adds a complex unit test for the BasicGroupCount descriptor
  • Rajarshi Guha
    Rajarshi Guha

    Applied and pushed