From: Martin G. <gue...@po...> - 2015-09-03 10:58:40
Sorry, my fault. The SmilesGenerator removed the stereochemistry. I just
noticed that I have to use .absolute() instead of .unique().

On 03.09.2015 at 12:36, Martin Gütlein wrote:
> Hi John,
>
> Awesome, thanks. suppressHydrogens works well, but it also suppresses Hs
> that carry stereochemistry information. Can this be avoided?
>
> > although possible you should perhaps store and index (unique check)
> > your structures separately.
> That's what we are trying to avoid, if possible.
>
> Martin
>
>
> On 03.09.2015 at 11:32, John M wrote:
>> Hi Martin,
>>
>> By design, CDK separates standardisation (H representation, tautomers,
>> protonation) from canonicalisation (ordering). You've found the
>> method to "sprout" hydrogens, but you actually want the opposite:
>> suppressHydrogens(mol)
>> <http://cdk.github.io/cdk/1.5/docs/api/org/openscience/cdk/tools/manipulator/AtomContainerManipulator.html#suppressHydrogens%28org.openscience.cdk.interfaces.IAtomContainer%29>.
>>
>> I think you might be conflating two things. Although it is possible,
>> you should perhaps store and index (unique check) your structures
>> separately. For your example you could assign a unique tautomer, but
>> you'll be back at square one with your first example.
>>
>> O=C[CH]1Cc2[nH]cnc2CC1 -> 1,3 proton shift -> C(=O)[CH]1Cc2c(CC1)[nH]cn2
>> CC(=O)N -> 1,3 proton shift -> CC(O)=N
>>
>> Thanks,
>>
>> Regards,
>> John W May
>> joh...@gm...
>>
>> On 3 September 2015 at 09:27, Martin Gütlein <gue...@un...> wrote:
>>
>>     One more thing:
>>     I noticed that unique SMILES differentiate between explicit and
>>     implicit hydrogens, e.g. "[H]Cl" is different from "Cl". This can
>>     be solved by running
>>     AtomContainerManipulator.convertImplicitToExplicitHydrogens(mol).
>>     However, I do not like having all my Hs defined explicitly. Is
>>     there an option in the CDK to convert explicit Hs back to
>>     implicit, leaving explicit only those Hs that are relevant?
>>
>>     Martin
>>
>>
>>     On 03.09.2015 at 09:48, Martin Gütlein wrote:
>>>     Hi John,
>>>
>>>     thanks for your reply. I tried to use unique (kekulized) SMILES
>>>     instead of InChIs.
>>>     What's good is that the structure of (most) compounds is stored
>>>     correctly (i.e., I can create an IAtomContainer that is
>>>     apparently equal).
>>>
>>>     However, I found an example where the unique SMILES of two
>>>     identical structures are different (see below).
>>>
>>>     Kind regards,
>>>     Martin
>>>
>>>
>>>     for (String smi : new String[] { "O=C[CH]1Cc2[nH]cnc2CC1",
>>>             "C(=O)[CH]1Cc2c(CC1)[nH]cn2" })
>>>     {
>>>         IAtomContainer mol = new SmilesParser(
>>>                 SilentChemObjectBuilder.getInstance()).parseSmiles(smi);
>>>         System.out.println(SmilesGenerator.unique().create(mol));
>>>     }
>>>
>>>
>>>     On 02.09.2015 at 20:58, John M wrote:
>>>>     Just to add on - if you really want to use InChI (don't) then
>>>>     you could store the AuxInfo, but the CDK doesn't have a
>>>>     conversion method that accepts it when turning it back into an
>>>>     AtomContainer.
>>>>
>>>>     I also notice you're using unique SMILES (the default in the old
>>>>     APIs); you probably want isomeric, which is non-canonical but
>>>>     stores stereochemistry.
>>>>
>>>>     String smi = SmilesGenerator.isomeric().create(container);
>>>>
>>>>     John
>>>>
>>>>     Regards,
>>>>     John W May
>>>>     joh...@gm...
>>>>
>>>>     On 2 September 2015 at 19:54, John M <joh...@gm...> wrote:
>>>>
>>>>         Hi Martin,
>>>>
>>>>         The InChI is an identifier and not a structure
>>>>         representation; it should never be used as such. For maximum
>>>>         preservation you should store compounds as Kekulé SMILES or
>>>>         Molfile. You can store additional data such as coordinates
>>>>         supplementary to the SMILES.
>>>>
>>>>         You might find a recent presentation by Noel (Open Babel) and
>>>>         Rajarshi (CDK) useful:
>>>>         http://baoilleach.blogspot.co.uk/2015/08/the-whole-of-cheminformatics-best.html
>>>>
>>>>         John
>>>>
>>
>>     --
>>     Dr. Martin Gütlein
>>     Phone:
>>     +49 (0)6131 39 23336 (office)
>>     +49 (0)177 623 9499 (mobile)
>>     Email:
>>     gue...@un...
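
For anyone reading this thread later: John's suggestion to "store and index (unique check) your structures separately" can be sketched in plain Java as below. This is only an illustration under stated assumptions, not CDK code. StructureStore and toyKey are hypothetical names made up for this sketch, and toyKey is a crude string placeholder, not a real canonicalisation. In a CDK setting the key function would presumably wrap whatever canonical form you settle on (this thread ends up with suppressHydrogens followed by SmilesGenerator.absolute()).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper, not part of the CDK: keeps every structure exactly as
// it was submitted and indexes it under a separately computed uniqueness key.
class StructureStore {
    private final Function<String, String> keyFunction;
    private final Map<String, List<String>> byKey = new LinkedHashMap<>();

    StructureStore(Function<String, String> keyFunction) {
        this.keyFunction = keyFunction;
    }

    // Stores the input unchanged; returns true if its key was not seen before.
    boolean add(String storedSmiles) {
        String key = keyFunction.apply(storedSmiles);
        List<String> bucket = byKey.computeIfAbsent(key, k -> new ArrayList<>());
        boolean isNew = bucket.isEmpty();
        bucket.add(storedSmiles);
        return isNew;
    }

    // Returns all stored representations that share the query's key.
    List<String> lookup(String smiles) {
        return byKey.getOrDefault(keyFunction.apply(smiles), new ArrayList<>());
    }

    // Number of distinct keys, i.e. structures considered unique.
    int uniqueCount() {
        return byKey.size();
    }

    // ASSUMPTION: placeholder key that only deletes the literal text "[H]".
    // It is a stand-in for a real canonical SMILES and is chemically naive.
    static String toyKey(String smiles) {
        return smiles.replace("[H]", "");
    }

    public static void main(String[] args) {
        StructureStore store = new StructureStore(StructureStore::toyKey);
        System.out.println(store.add("[H]Cl")); // is this key new?
        System.out.println(store.add("Cl"));    // same key under toyKey?
        System.out.println(store.uniqueCount());
        System.out.println(store.lookup("Cl"));
    }
}
```

The point of the split is that the stored string never has to be canonical or hydrogen-suppressed, so nothing is lost when you change how the key is computed; only the index needs rebuilding.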