#254 MDL readers to deal with D and T atom symbols

Needs_Review
closed
master (162)
5
2012-10-28
2010-08-18
No

Currently the MDL readers parse the atom symbols D (deuterium) and T (tritium) into pseudo atoms with the label "D" or "T".
This patch fixes that: D and T now result in heavy hydrogen atoms with mass number 2 or 3, respectively.
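
To illustrate the intended mapping, here is a minimal, self-contained sketch. It is not the CDK reader code: the class HeavyHydrogenSketch, the type ParsedAtom, and the method interpret are hypothetical names used only for this example.

```java
public class HeavyHydrogenSketch {

    /** Minimal stand-in for a parsed atom (hypothetical, not a CDK class). */
    public static final class ParsedAtom {
        public final String symbol;      // element symbol after interpretation
        public final Integer massNumber; // null when no isotope is specified

        public ParsedAtom(String symbol, Integer massNumber) {
            this.symbol = symbol;
            this.massNumber = massNumber;
        }
    }

    /**
     * Maps the molfile symbols D and T to hydrogen with mass number 2 or 3.
     * Any other symbol is passed through without a mass number.
     */
    public static ParsedAtom interpret(String symbol) {
        switch (symbol) {
            case "D":
                return new ParsedAtom("H", 2); // deuterium
            case "T":
                return new ParsedAtom("H", 3); // tritium
            default:
                return new ParsedAtom(symbol, null);
        }
    }

    public static void main(String[] args) {
        for (String s : new String[] {"D", "T", "C"}) {
            ParsedAtom atom = interpret(s);
            System.out.println(s + " -> " + atom.symbol + ", mass number " + atom.massNumber);
        }
    }
}
```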

Discussion

  • Egon Willighagen

    Mark, how does that relate to the 'interpretHydrogenIsotopes' IO setting?

    The MDLV2000Reader converts the pseudo atoms into real atoms when that parameter is set, around lines 842-868...

     
  • Mark Rijnbeek

    Mark Rijnbeek - 2010-08-18

    Hi Egon, sorry, I missed that IO setting. It seems strange, though, that interpretHydrogenIsotopes is true by default but I get pseudo atoms anyway. I will have a further look and will alter or ditch this patch.

     
  • Mark Rijnbeek

    Mark Rijnbeek - 2010-08-18

    Attached a new patch, reworked to use the IO setting. In my case the setting had no effect because ChEMBL puts "D" and "T" symbols in its molfiles without a corresponding "M  ISO" line. I added two of these ChEMBL molfiles to the patch and unit tests.
    The method fixHydrogenIsotopes is now a bit more lenient and sets the mass number itself for D and T using an IsotopeFactory.

     
  • Egon Willighagen

    Mark, you should use the IChemObjectBuilder pattern instead of instantiating a particular interface implementation directly. Also, the assert() convention is to pass the expected value first, then the tested value. Please review the two patches attached.

    Also, in the two new tests, you set the IO property for one but not the other... why is that?

     
  • Rajarshi Guha

    Rajarshi Guha - 2010-08-29

    applied and pushed

     

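
The two review points raised in the discussion (create objects through a builder rather than a concrete class, and pass the expected value before the tested value in assertions) can be sketched as follows. This is only an illustration under stated assumptions: the Atom and AtomBuilder interfaces, SimpleAtom, SIMPLE_BUILDER, and the hand-rolled assertEquals are hypothetical stand-ins, not the CDK or JUnit APIs.

```java
import java.util.Objects;

public class BuilderPatternSketch {

    /** Hypothetical atom interface, standing in for an interface such as IAtom. */
    public interface Atom {
        String getSymbol();
        Integer getMassNumber();
        void setMassNumber(Integer massNumber);
    }

    /** Hypothetical builder interface, standing in for IChemObjectBuilder. */
    public interface AtomBuilder {
        Atom newAtom(String symbol);
    }

    /** One concrete implementation; the reader-side code below never names it. */
    public static final class SimpleAtom implements Atom {
        private final String symbol;
        private Integer massNumber;

        public SimpleAtom(String symbol) {
            this.symbol = symbol;
        }

        @Override public String getSymbol() { return symbol; }
        @Override public Integer getMassNumber() { return massNumber; }
        @Override public void setMassNumber(Integer massNumber) { this.massNumber = massNumber; }
    }

    /** A builder backed by SimpleAtom; callers can swap in another implementation. */
    public static final AtomBuilder SIMPLE_BUILDER = SimpleAtom::new;

    /** Reader-side code: depends only on the interfaces, not on SimpleAtom. */
    public static Atom deuterium(AtomBuilder builder) {
        Atom atom = builder.newAtom("H");
        atom.setMassNumber(2);
        return atom;
    }

    /** Hand-rolled check with the expected value first, then the tested value. */
    public static void assertEquals(Object expected, Object actual) {
        if (!Objects.equals(expected, actual)) {
            throw new AssertionError("expected:<" + expected + "> but was:<" + actual + ">");
        }
    }

    public static void main(String[] args) {
        Atom atom = deuterium(SIMPLE_BUILDER);
        assertEquals("H", atom.getSymbol());
        assertEquals(2, atom.getMassNumber());
    }
}
```

Keeping the expected value first matters mainly for failure messages: with the arguments swapped, a failing check would report the tested value as the expected one.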