From: Syed A. R. <as...@eb...> - 2011-09-09 00:37:16
Hi Joos,

I concur with Nina's view on library vs. database search. Just wondering whether you used the right code:

https://github.com/asad/SMSD/tree/master/src/org/openscience/smsd/algorithm/vflib/substructure

Kindly test it with the above code, or use

https://github.com/asad/SMSD/blob/master/src/org/openscience/smsd/Substructure.java

Asad

On 8 Sep 2011, at 10:48, cdk...@li... wrote:

> Send Cdk-user mailing list submissions to
>     cdk...@li...
>
> To subscribe or unsubscribe via the World Wide Web, visit
>     https://lists.sourceforge.net/lists/listinfo/cdk-user
> or, via email, send a message with subject or body 'help' to
>     cdk...@li...
>
> You can reach the person managing the list at
>     cdk...@li...
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Cdk-user digest..."
>
>
> Today's Topics:
>
>    1. comment about VF2 implementation from chemkit - possible bug
>       (Joos Kiener)
>    2. Re: comment about VF2 implementation from chemkit - possible
>       bug (Nina Jeliazkova)
>    3. Re: inchi generator and valency errors (Sam Adams)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Thu, 8 Sep 2011 11:24:41 +0200
> From: Joos Kiener <jo...@su...>
> Subject: [Cdk-user] comment about VF2 implementation from chemkit -
>     possible bug
> To: cdk...@li...
> Message-ID:
>     <CAH...@ma...>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Hi all,
>
> First off, I guess you will be hearing more from me sooner rather than later,
> but now to the actual subject. Please see:
>
> http://chembioinfo.wordpress.com/2011/03/15/benchmarking-substructure-search/
>
> for the context of this message.
>
> Currently I'm playing around with substructure search (I have a certain
> goal in mind, more on that in later messages). Anyway, UIT isn't exactly fast,
> especially compared to commercial products like ChemFinder or InstantJChem,
> where searches seem almost instantaneous.
>
> I was comparing UIT and the above referenced code ported from chemkit. First,
> the difference in real-world usage seems much less extreme than in that
> benchmark (for small molecules), or I'm misinterpreting the chart. Anyway, in
> my case it takes about 60% of the time compared to UIT.
>
> Now to the subject of the message. I think there is an issue in the ported
> version. The following query returns 44 hits with chemkit and 106 with UIT.
> ChemFinder also gives 106 hits, so I'm inclined to believe 106 is correct.
>
> Here is the query molecule:
>
> CCC(C(CC(C(C)C)C)C)C
>
> I did not find or check for any other inconsistencies.
>
> Best Regards,
>
> Joos Kiener
>
> ------------------------------
>
> Message: 2
> Date: Thu, 8 Sep 2011 12:34:06 +0300
> From: Nina Jeliazkova <jel...@gm...>
> Subject: Re: [Cdk-user] comment about VF2 implementation from chemkit
>     - possible bug
> To: Joos Kiener <jo...@su...>
> Cc: cdk...@li...
> Message-ID:
>     <CAE5qDd112RU46mk-goCdVVR1-Ni=3Ko...@ma...>
> Content-Type: text/plain; charset="utf-8"
>
> On 8 September 2011 12:24, Joos Kiener <jo...@su...> wrote:
>
>> Hi all,
>>
>> First off, I guess you will be hearing more from me sooner rather than later,
>> but now to the actual subject. Please see:
>>
>> http://chembioinfo.wordpress.com/2011/03/15/benchmarking-substructure-search/
>>
>> for the context of this message.
>>
>> Currently I'm playing around with substructure search (I have a certain
>> goal in mind, more on that in later messages). Anyway, UIT isn't exactly fast,
>> especially compared to commercial products like ChemFinder or InstantJChem,
>> where searches seem almost instantaneous.
>>
>
> Just to note: comparing a library method (such as UIT) with a database
> search is not quite fair, as database search systems usually employ a lot of
> pre-screening and other optimization techniques.
>
> E.g. this online search does use CDK (but not UIT):
>
> http://apps.ideaconsult.net:8080/ambit2/query/smarts?type=smiles&search=CCC%28C%28CC%28C%28C%29C%29C%29C%29C&text=&page=0&pagesize=100
>
> Best regards,
> Nina
>
> ------------------------------
>
> Message: 3
> Date: Thu, 8 Sep 2011 10:48:40 +0100
> From: Sam Adams <s.e...@gm...>
> Subject: Re: [Cdk-user] inchi generator and valency errors
> To: Nina Jeliazkova <jel...@gm...>
> Cc: Sam Adams <se...@ca...>, cdk...@li...
> Message-ID:
>     <CALiCMJ5ZrZ=Aw3...@ma...>
> Content-Type: text/plain; charset="utf-8"
>
> Hi,
>
> InChI generation should work with either implicit or explicit hydrogens.
>
> It looks like there's a bug in the passing of aromatic bonds from CDK to
> InChI (do I remember correctly that CDK's aromaticity handling got adjusted
> a couple of years ago?). Anyway, deleting lines 294-295 from InChIGenerator
> should fix things.
>
> https://github.com/egonw/cdk/blob/master/src/main/org/openscience/cdk/inchi/InChIGenerator.java#L294:
> if (bond.getFlag(CDKConstants.ISAROMATIC)) {
>     order = INCHI_BOND_TYPE.ALTERN;
>
> I haven't got CDK on my machine at the moment, so it would be quicker for
> someone else to make the changes.
>
> Cheers,
>
> Sam
>
>
> On 4 September 2011 07:39, Nina Jeliazkova <jel...@gm...> wrote:
>
>> On 4 September 2011 09:25, Egon Willighagen <ego...@gm...> wrote:
>>
>>> cc:Sam (author of the CDK-InChI bridge)
>>>
>>> On Sun, Sep 4, 2011 at 7:52 AM, Nina Jeliazkova
>>> <jel...@gm...> wrote:
>>>> This usually happens when the molecule does not contain explicit
>>>> hydrogens.
>>>
>>> I was not aware of that. Doesn't sound very useful. We don't have a
>>> unit test for this yet, right? One that tests InChI generation for a
>>> compound with and without explicit hydrogens?
>>>
>>> Does this happen for any compound?
>>>
>>
>> Aromatics only.
>>
>> I ran into this issue only recently, when working on metabolite generation
>> in Toxtree; I haven't tested on a large scale.
>>
>>>
>>>> E.g. the test below fails with exactly the same message: "Accepted
>>>> unusual valence(s): C(3); Cannot process aromatic bonds"
>>>>
>>>> SmilesParser p = new
>>>>     SmilesParser(NoNotificationChemObjectBuilder.getInstance());
>>>> IMolecule mol = p.parseSmiles("CN1C=NC2=C1C(=O)N(C(=O)N2C)C");
>>>> /*
>>>> CDKHydrogenAdder ha =
>>>>     CDKHydrogenAdder.getInstance(NoNotificationChemObjectBuilder.getInstance());
>>>> ha.addImplicitHydrogens(mol);
>>>> AtomContainerManipulator.convertImplicitToExplicitHydrogens(mol);
>>>> */
>>>> InChIGeneratorFactory factory = InChIGeneratorFactory.getInstance();
>>>> InChIGenerator gen = factory.getInChIGenerator(mol);
>>>> INCHI_RET ret = gen.getReturnStatus();
>>>> if (ret != INCHI_RET.OKAY) {
>>>>     throw new Exception(String.format("InChI failed: %s [%s]",
>>>>         ret.toString(), gen.getMessage()));
>>>> }
>>>> String inchi = gen.getInchi();
>>>> Assert.assertEquals("InChI=1S/C8H10N4O2/c1-10-4-9-6-5(10)7(13)12(3)8(14)11(6)2/h4H,1-3H3",
>>>>     inchi);
>>>
>>> I'll try to use this to create a unit test. I'll also make one for
>>> methane... wondering how widespread this issue is...
>>>
>>
>> Alkanes work fine.
>>
>> Nina
>>
>>>
>>>> Uncomment the hydrogen adder code and the test will succeed.
>>>> I haven't investigated whether the explicit H requirement is the normal
>>>> InChI behaviour or something in the cdk-inchi interaction; perhaps
>>>> others could help. Otherwise, I agree the current behaviour is not quite
>>>> convenient.
>>>
>>> Sam, do you know what is going on?
>>>
>>> Egon
>>>
>>>
>>> --
>>> Dr E.L. Willighagen
>>> Postdoctoral Researcher
>>> Institutet för miljömedicin
>>> Karolinska Institutet (http://ki.se/imm)
>>> Homepage: http://egonw.github.com/
>>> LinkedIn: http://se.linkedin.com/in/egonw
>>> Blog: http://chem-bla-ics.blogspot.com/
>>> PubList: http://www.citeulike.org/user/egonw/tag/papers
>>>
>
> ------------------------------
>
> End of Cdk-user Digest, Vol 64, Issue 5
> ***************************************
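
The point about pre-screening in database search (Message 2 above) can be illustrated with a minimal sketch. It assumes every molecule has already been reduced to a fingerprint bit set; the class name, the method names, and the bit positions are hypothetical and are not taken from CDK, AMBIT, or SMSD. A target can contain the query only if every bit set in the query is also set in the target, so the expensive atom-by-atom match (UIT or VF2) would only need to run on the survivors.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

// Hypothetical illustration of fingerprint pre-screening; not a CDK API.
class FingerprintScreen {

    // Builds a fingerprint from the given bit positions (made-up values).
    static BitSet bits(int... positions) {
        BitSet set = new BitSet();
        for (int p : positions) {
            set.set(p);
        }
        return set;
    }

    // True if every bit set in the query is also set in the target.
    static boolean passesScreen(BitSet query, BitSet target) {
        BitSet missing = (BitSet) query.clone();
        missing.andNot(target); // keeps only query bits absent from the target
        return missing.isEmpty();
    }

    // Returns the indices of library entries that survive the screen.
    // Only these would be handed to a full substructure matcher.
    static List<Integer> screen(BitSet query, List<BitSet> library) {
        List<Integer> survivors = new ArrayList<>();
        for (int i = 0; i < library.size(); i++) {
            if (passesScreen(query, library.get(i))) {
                survivors.add(i);
            }
        }
        return survivors;
    }

    public static void main(String[] args) {
        BitSet query = bits(1, 3);
        List<BitSet> library = Arrays.asList(bits(1, 2, 3), bits(2, 3), bits(1, 3, 5));
        System.out.println(screen(query, library)); // prints [0, 2]
    }
}
```

A screen of this kind only discards candidates that cannot match, so it should speed up a search without changing the final hit count. It would therefore not by itself account for a discrepancy such as 44 vs. 106 hits.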