#254 MDL readers to deal with D and T atom symbols

Needs_Review
closed
Mark Rijnbeek
master (162)
5
2012-10-28
2010-08-18
Mark Rijnbeek
No

Currently the MDL readers parse the atom symbols D (deuterium) and T (tritium) into pseudo atoms with the label "D" or "T".
This patch fixes that: D and T are read as heavy hydrogens, that is, hydrogen atoms with mass number 2 or 3.
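
The intended mapping can be sketched in a few lines of plain Java. This is only an illustration of the idea, not the patch itself: the class HeavyHydrogenSketch and its methods are hypothetical names and are not part of the CDK API.

```java
import java.util.OptionalInt;

// Hypothetical helper for illustration only; not a CDK class.
final class HeavyHydrogenSketch {

    // Returns the mass number implied by a heavy-hydrogen shorthand symbol,
    // or an empty value for any other symbol.
    static OptionalInt massNumberFor(String symbol) {
        switch (symbol) {
            case "D":
                return OptionalInt.of(2); // deuterium
            case "T":
                return OptionalInt.of(3); // tritium
            default:
                return OptionalInt.empty();
        }
    }

    // Describes how a symbol would be read: D and T become hydrogen with
    // an explicit mass number, everything else is left unchanged.
    static String describe(String symbol) {
        OptionalInt mass = massNumberFor(symbol);
        return mass.isPresent() ? "H (mass number " + mass.getAsInt() + ")" : symbol;
    }

    public static void main(String[] args) {
        for (String symbol : new String[] {"D", "T", "C"}) {
            System.out.println(symbol + " -> " + describe(symbol));
        }
    }
}
```

In a real reader the mass number would be set on the atom object rather than returned as a string; the string form is used here only to keep the sketch self-contained.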

Discussion

  • Mark, how does that relate to the 'interpretHydrogenIsotopes' IO setting?

    The MDLV2000Reader converts the pseudo atoms into real atoms when that parameter is set, around lines 842-868...

     
  • Mark Rijnbeek
    2010-08-18

    Hi Egon, sorry, I missed that IO setting. It seems strange, though, that interpretHydrogenIsotopes is true by default, yet I get pseudo atoms anyway. I will take a further look and then alter or ditch this patch.

     
  • Mark Rijnbeek
    2010-08-18

    Attached a new patch, reworked to use the IO setting. In my case the setting did not work because ChEMBL puts "D" and "T" symbols in their molfiles without a corresponding "M  ISO" line. I added two of these ChEMBL molfiles to the patch, along with unit tests.
    The method fixHydrogenIsotopes is now a bit more lenient and sets the mass number itself for D and T, using an IsotopeFactory.

     
  • Mark, you should use the IChemObjectBuilder pattern instead of instantiating a particular interface implementation directly. Also, the assert() pattern is the expected value first, then the tested value. Please review the two patches attached.

    Also, in the two new tests, you set the IO property for one but not the other... why is that?

     
  • Rajarshi Guha
    2010-08-29

    applied and pushed
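
The reworked behaviour discussed in this thread, where the interpretHydrogenIsotopes setting decides whether "D" and "T" pseudo atoms are converted even without isotope information in the molfile, can be sketched roughly as follows. This is a minimal stand-alone illustration, not the applied patch: the Atom class, its fields, and the use of "R" as the pseudo-atom symbol are assumptions made for the example, and the real code works on CDK atom objects created through a builder.

```java
// Hypothetical sketch; the class and field names are not CDK API.
final class IsotopeSettingSketch {

    /** Minimal stand-in for an atom produced by a molfile reader. */
    static final class Atom {
        final String symbol;      // element symbol; "R" is used here for pseudo atoms
        final String label;       // label carried by a pseudo atom, e.g. "D"
        final Integer massNumber; // null when no isotope information is known
        final boolean pseudo;     // true while the atom is still a pseudo atom

        Atom(String symbol, String label, Integer massNumber, boolean pseudo) {
            this.symbol = symbol;
            this.label = label;
            this.massNumber = massNumber;
            this.pseudo = pseudo;
        }

        static Atom pseudoAtom(String label) {
            return new Atom("R", label, null, true);
        }
    }

    // Converts a "D" or "T" pseudo atom into hydrogen with mass number 2 or 3,
    // but only when the setting is enabled. Any other atom is returned as-is.
    static Atom fixHydrogenIsotope(Atom atom, boolean interpretHydrogenIsotopes) {
        if (!interpretHydrogenIsotopes || !atom.pseudo) {
            return atom;
        }
        final int massNumber;
        if ("D".equals(atom.label)) {
            massNumber = 2;
        } else if ("T".equals(atom.label)) {
            massNumber = 3;
        } else {
            return atom; // some other pseudo atom, e.g. an R group
        }
        return new Atom("H", null, massNumber, false);
    }

    public static void main(String[] args) {
        Atom on = fixHydrogenIsotope(Atom.pseudoAtom("D"), true);
        Atom off = fixHydrogenIsotope(Atom.pseudoAtom("D"), false);
        System.out.println("setting on:  " + on.symbol + ", mass number " + on.massNumber);
        System.out.println("setting off: " + off.symbol + ", label " + off.label);
    }
}
```

Checking the conversion with the setting both enabled and disabled makes explicit which behaviour each unit test covers.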