#469 proper handling of pseudoatoms in MDL files

Rajarshi Guha
cdk-1.4.x (181)
Rajarshi Guha

I was looking at a bug report where reading an SD file containing pseudoatoms (such as Gln, Ile etc) worked fine, but writing it out crashed with an exception.

The reason for this is that during reading the atom with label Gln, is read in as a PseudoAtom with label "Gln" but symbol "R".

The MDLV2000Writer sees the symbol R for this atom and assumes that it is an R group and then tries to parse the label to get an R-group number. Obviously for this case it fails.

The problem is that all PseudoAtom objects get a symbol of R. While generally OK, this is not always correct for sdf/mol files. There are two possible solutions:

  1. Modify PseudoAtom and related classes to allow one to specify the symbol as well as the label
  2. Modify MDLV2000Reader & MDLV3000Reader to set the symbol for a pseudo atom explicitly to the label.

This patch implements 2


  • Rajarshi, my comments...

    1. can you please add a test for the writer too, showing that something is fixed, like that crash?

    2. Assert.assertTrue(molecule.getAtom(4).getSymbol().equals("Gln"));

    should be:

    Assert.assertEquals("Gln", molecule.getAtom(4).getSymbol());

    there is a second assertTrue that needs a similar replacement.

    1. add the same unit test for the MDLV3000Reader too, please


  • Rajarshi Guha
    Rajarshi Guha

    Added a new patch that addresses all comments - ignore the initial patch. I can't seem to remove the first patch

  • The patch now only applies to master... should we not try to apply this too cdk-1.4.x?

  • Rajarshi, if you agree with application to cdk-1.4.x, please find attached your patch signed off by me, and a patch with fixes, which you need to review.

    If you prefer master only, then let me know, and I'll push your patch to master.

  • Rajarshi Guha
    Rajarshi Guha

    I agree with backporting to 1.4.x - I've applied and pushed them. Can you apply and push to master?

  • I will try to keep doing merges... still worked the last time :)