Learn how easy it is to sync an existing GitHub or Google Code repo to a SourceForge project! See Demo

Close

#1296 Atomic Numbers not set for atoms generated through MolecularFormulaManipulator

master
closed
nobody
None
1
2013-09-23
2013-03-28
No

AtomNumbers not set for atoms in atomContainer generated through MolecularFormulaManipulator.getAtomContainer(formula).

Discussion

  • John May
    John May
    2013-03-28

    atomNumbers - atomic numbers

    Creating an atom like so:

    new Atom("C");
    

    will set the atomic number to 6. Formula manipulator does not do this, and instead sets the symbol and leaves the atomic number as null.

     
  • John May
    John May
    2013-03-28

    • summary: AtomNumbers not set for atoms generated through MolecularFormulaManipulator --> Atomic Numbers not set for atoms generated through MolecularFormulaManipulator
     
  • John May
    John May
    2013-04-12

    Kalai, how are you creating the IMolecularFormula? Just looking at the code and actually it looks like it's doing the correct thing.

     
  • I think I can reproduce it with this test (it fails, that is):

    /**
     * @cdk.bug 1296
     */
    @Test
    public void testGetAtomContainer_AddsAtomicNumbers(){
        IMolecularFormula mf2 = new MolecularFormula();
        mf2.addIsotope(builder.newInstance(IIsotope.class,"C"),2);
        mf2.addIsotope(builder.newInstance(IIsotope.class,"H"),4);      
        IAtomContainer ac = MolecularFormulaManipulator.getAtomContainer(
            mf2, builder.newInstance(IAtomContainer.class)
        );      
        Assert.assertEquals(6, ac.getAtomCount());
        Assert.assertNotNull(ac.getAtom(0).getAtomicNumber());
        Assert.assertEquals(2, ac.getAtom(0).getAtomicNumber().intValue());
    }
    
     
  • OK, the problem is that it copies the atomic number from the IIsotope that come from the IMolecularFormule, which are confirmed not to have them set... :/

     
  • Well, as far as I can see, this is the expected behavior... after all:

    new Isotope("C");

    does not set the atomic number either...

    "Fixing" this requires an API change, which I am happy with for master. John, your call :)

    Patch pending...

     
  • Mmmm... it causes some 20 regressions... :(

     
  • OK, here's a bit of the breakdown:

    • hash module -> seems to boil down to 7 hash tests assuming that the atomic number is not set, resulting in some is(not()) tests now failing:

    allenesWithExplicitHydrogens(), figure13a(), figure13b(), figure11(), figure12(), allenes2Dand3D() all from org.openscience.cdk.hash.HashCodeScenarios

    • smarts, fingerprint -> PubChemFingerprinter

    The problem in these two modules was caused by the fact that I had not updated QueryAtom to also set the atomic number.

    OK, rerunning the test suite...

     
  • OK, forget that last comment... not sure what I saw, but adding that line to the QueryAtom does not solve regressions :/

     
  • John May
    John May
    2013-08-02

    Okay, very strange on the hash. There is no encoding of symbols - only the atomic numbers. This was actually how Kalai found the bug.

    Let me double check the hash as this 'souldn't' cause them to fail...

     
  • John May
    John May
    2013-08-02

    Okay I'm on the trail! The hash ones is because the atomic number is set to '1'. I have a sneaking suspicion someone has done this 'new Atom("H")' somewhere. There was a very similar bug in the CML reader.

     
  • John May
    John May
    2013-08-02

    Bazinga!

    The OWLAtomTypeHandler reads all atoms types initially as 'H'. This sets the atomic number as '1' (new behaviour). This then propagates when the atom types are configured.

    Really easy fix is to not initialise it as 'H' :-). I'm sure there's others though where it's done. I have seen it before.

     
  • John May
    John May
    2013-09-23

    • status: open --> closed
     
  • John May
    John May
    2013-09-23

    Applied