From: Nikolay K. <ni...@un...> - 2014-03-13 13:17:29
|
Hi,

I get a NullPointerException when I try to generate the SMILES for a molecule whose hydrogen atoms have been converted from implicit to explicit. Am I missing something in the explicit hydrogen manipulation? The exception is thrown by the Beam package.

I use CDK 1.5.6. Here is the Java code:

    public void testSmilesParser(String smiles) throws Exception
    {
        System.out.println("Testing smiles: " + smiles);
        SmilesParser sp = new SmilesParser(SilentChemObjectBuilder.getInstance());
        IAtomContainer mol = sp.parseSmiles(smiles);

        AtomContainerManipulator.percieveAtomTypesAndConfigureAtoms(mol);
        CDKHydrogenAdder adder = CDKHydrogenAdder.getInstance(SilentChemObjectBuilder.getInstance());
        adder.addImplicitHydrogens(mol);
        HydrogenAdderProcessor.convertImplicitToExplicitHydrogens(mol);

        SmilesGenerator smiGen = new SmilesGenerator();
        String smiles2 = smiGen.create(mol);

        System.out.println(smiles + " --> " + smiles2);
    }

    testSmilesParser("CC");

And here is the output:

    Testing smiles: CC
    Exception in thread "main" java.lang.NullPointerException: One or more atoms had an undefined number of implicit hydrogens
        at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:208)
        at org.openscience.cdk.smiles.CDKToBeam.toBeamAtom(CDKToBeam.java:200)
        at org.openscience.cdk.smiles.CDKToBeam.toBeamGraph(CDKToBeam.java:148)
        at org.openscience.cdk.smiles.SmilesGenerator.create(SmilesGenerator.java:376)
        at org.openscience.cdk.smiles.SmilesGenerator.create(SmilesGenerator.java:332)
        at ambit2.smarts.test.TestUtilities.testSmilesParser(TestUtilities.java:1534)
        at ambit2.smarts.test.TestUtilities.main(TestUtilities.java:2127)

Analogously, if the molecule is read from a MOL file, the same exception is thrown.

If the input SMILES is defined with explicit hydrogens, there is no problem.

    testSmilesParser("[H]C([H])([H])C([H])([H])[H]");

outputs:

    Testing smiles: [H]C([H])([H])C([H])([H])[H]
    [H]C([H])([H])C([H])([H])[H] --> [H]C([H])([H])C([H])([H])[H]

With best regards
Nick

----------------------------------------------------------
Dr. Nikolay Kochev
University of Plovdiv
Department of Analytical Chemistry and Computer Chemistry
----------------------------------------------------------
From: John M. <joh...@gm...> - 2014-03-13 17:38:52
|
Hi Nick,

This is expected: all implicit hydrogen counts must now be specified when generating SMILES. The hydrogen count is set to null if the atom typer doesn’t know the atom type; usually these are pseudo atoms and some metals. You can either set the nulls to 0, or avoid modifying the counts in the first place. You can still do atom typing, but in general CDKHydrogenAdder.addImplicitHydrogens is only needed if the count is not present. For Mol, SMILES and InChI input the hydrogen counts are already present, so the method is not needed.

You actually lose information, see the first code snippet here: http://efficientbits.blogspot.nl/2013/12/new-smiles-behaviour-parsing-cdk-154.html

Hope that helps,
J

Here’s a corrected version of your code. I should add that SmilesParser/Generator are now thread safe, so you don’t need a new instance all the time.

> public void testSmilesParser(String smiles) throws Exception
> {
>     System.out.println("Testing smiles: " + smiles);
>     SmilesParser sp = new SmilesParser(SilentChemObjectBuilder.getInstance());
>     IAtomContainer mol = sp.parseSmiles(smiles);
>
>     HydrogenAdderProcessor.convertImplicitToExplicitHydrogens(mol);
>
>     SmilesGenerator smiGen = new SmilesGenerator();
>     String smiles2 = smiGen.create(mol);
>
>     System.out.println(smiles + " --> " + smiles2);
> }
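To make the point about null counts concrete, here is a minimal sketch in plain Java. It does not use CDK at all: `ToyAtom`, `write` and `setNullCountsToZero` are hypothetical names made up for this illustration, and the output is just a chain of bracket atoms. The only things taken from the thread are the idea that a hydrogen count can be null and the wording of the exception message.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

public class NullHydrogenCountDemo {

    /** Hypothetical stand-in for an atom; this is not a CDK class. */
    static final class ToyAtom {
        final String symbol;
        Integer implicitHydrogens; // null means "undefined", as described in the thread

        ToyAtom(String symbol, Integer implicitHydrogens) {
            this.symbol = symbol;
            this.implicitHydrogens = implicitHydrogens;
        }
    }

    /** Writes the atoms as a chain of bracket atoms; rejects undefined counts. */
    static String write(List<ToyAtom> atoms) {
        StringBuilder sb = new StringBuilder();
        for (ToyAtom atom : atoms) {
            // Fails with a NullPointerException if the count was never set.
            int h = Objects.requireNonNull(atom.implicitHydrogens,
                    "One or more atoms had an undefined number of implicit hydrogens");
            sb.append('[').append(atom.symbol);
            if (h > 0) sb.append('H');
            if (h > 1) sb.append(h);
            sb.append(']');
        }
        return sb.toString();
    }

    /** Replaces undefined counts with 0 and returns how many atoms were changed. */
    static int setNullCountsToZero(List<ToyAtom> atoms) {
        int changed = 0;
        for (ToyAtom atom : atoms) {
            if (atom.implicitHydrogens == null) {
                atom.implicitHydrogens = 0;
                changed++;
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        List<ToyAtom> atoms = new ArrayList<>();
        atoms.add(new ToyAtom("C", 3));
        atoms.add(new ToyAtom("C", null)); // count left undefined

        try {
            write(atoms);
        } catch (NullPointerException e) {
            System.out.println("Failed: " + e.getMessage());
        }

        setNullCountsToZero(atoms);
        System.out.println(write(atoms));
    }
}
```

Setting an undefined count to 0 makes the error go away, but for a pseudo atom or a metal it is a guess about the chemistry, which is presumably why leaving the parsed counts untouched is the other option mentioned above.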
From: John M. <joh...@gm...> - 2014-03-13 20:16:00
|
Ah, right, just noticed… custom hydrogen adding.

> HydrogenAdderProcessor.convertImplicitToExplicitHydrogens(mol);

You need this: https://github.com/cdk/cdk/commit/3d61ec01c584cdef1b8d1ceb9110af56f13b555d

Note, the AtomContainerManipulators have also been updated to correctly handle hydrogen suppression/adding, accounting for stereo elements.

J
From: Nikolay K. <ni...@un...> - 2014-03-14 06:45:09
|
On 2014-03-13 22:15, John May wrote:
> Ah, right just noticed… custom hydrogen adding.
>
>> HydrogenAdderProcessor.convertImplicitToExplicitHydrogens(mol);
>
> You need this :
> https://github.com/cdk/cdk/commit/3d61ec01c584cdef1b8d1ceb9110af56f13b555d

Thank you, John.

I used:

    AtomContainerManipulator.convertImplicitToExplicitHydrogens(mol);

instead of:

    HydrogenAdderProcessor.convertImplicitToExplicitHydrogens(mol);

So if I got it correctly, in CDK 1.5.x this is the right approach to produce a molecule with explicit hydrogens.

With best regards
Nick
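For readers wondering what an implicit-to-explicit conversion has to keep track of, here is a small sketch in plain Java. It is not the CDK implementation and it does not reproduce the linked commit, whose contents the thread does not describe. `ToyAtom`, `ToyMolecule` and `convertImplicitToExplicit` are hypothetical names, and the rule that every newly created hydrogen gets a defined count of 0 is an assumption made for this illustration, chosen because SMILES generation requires every atom to have a defined count.

```java
import java.util.ArrayList;
import java.util.List;

public class ExplicitHydrogenDemo {

    /** Hypothetical stand-in for an atom; this is not a CDK class. */
    static final class ToyAtom {
        final String symbol;
        Integer implicitHydrogens; // null means "undefined"

        ToyAtom(String symbol, Integer implicitHydrogens) {
            this.symbol = symbol;
            this.implicitHydrogens = implicitHydrogens;
        }
    }

    /** Hypothetical stand-in for a molecule: atoms plus bonds stored as index pairs. */
    static final class ToyMolecule {
        final List<ToyAtom> atoms = new ArrayList<>();
        final List<int[]> bonds = new ArrayList<>();
    }

    /** Builds ethane as two carbons with three implicit hydrogens each. */
    static ToyMolecule ethane() {
        ToyMolecule mol = new ToyMolecule();
        mol.atoms.add(new ToyAtom("C", 3));
        mol.atoms.add(new ToyAtom("C", 3));
        mol.bonds.add(new int[] {0, 1});
        return mol;
    }

    /** Turns each implicit hydrogen into an explicit atom bonded to its parent. */
    static void convertImplicitToExplicit(ToyMolecule mol) {
        int originalCount = mol.atoms.size(); // visit only the atoms present at the start
        for (int i = 0; i < originalCount; i++) {
            ToyAtom atom = mol.atoms.get(i);
            if (atom.implicitHydrogens == null) continue; // leave undefined counts alone
            int n = atom.implicitHydrogens;
            for (int h = 0; h < n; h++) {
                // Assumption: a new hydrogen carries a defined count of 0, never null.
                mol.atoms.add(new ToyAtom("H", 0));
                mol.bonds.add(new int[] {i, mol.atoms.size() - 1});
            }
            atom.implicitHydrogens = 0; // the hydrogens are now explicit atoms
        }
    }

    /** Returns true if no atom has an undefined implicit hydrogen count. */
    static boolean allCountsDefined(ToyMolecule mol) {
        for (ToyAtom atom : mol.atoms) {
            if (atom.implicitHydrogens == null) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        ToyMolecule mol = ethane();
        convertImplicitToExplicit(mol);
        System.out.println(mol.atoms.size() + " atoms, " + mol.bonds.size() + " bonds, all counts defined: "
                + allCountsDefined(mol));
    }
}
```

The check in `allCountsDefined` is the kind of invariant a custom hydrogen adder could verify before handing the molecule to a SMILES writer.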
From: John M. <joh...@gm...> - 2014-03-14 09:05:51
|
Hi Nick,

Yep, you’ve got it. With the new ring perception, atom typing isn’t as slow as it used to be, but for me the main reason I tended to use it was to add hydrogens. Now that these are present on the inputs I use, I only do atom typing for the following reasons:

1) to check whether an atom type is known to the CDK; an unknown type could indicate a dodgy molecule
2) hybridisation is needed
3) a method/algorithm needs it

For the last point, aromaticity perception and SMARTS both used to need atom typing but don’t anymore. I actually don’t think the SDG needs it either… but I can’t remember.

That actually reminds me: you have your own SMARTS parsing/matching that uses the SMARTSQueryAtoms? You should check how the SMARTSQueryTool (or SmartsPattern, on the SF patch tracker) does the match now. The semantics of how query atoms check values have changed.

Cheers,
J
From: Nikolay K. <ni...@un...> - 2014-03-14 10:33:55
|
Thanks for the reminder, John.

The Ambit project currently still depends on CDK 1.4.x, but we plan to migrate to CDK 1.5.x soon, so I will definitely check all these issues. At the moment we use 1.5.x on a separate branch for some preliminary testing etc.

Best regards
Nick