Thread: [Rdkit-discuss] name generator
Open-Source Cheminformatics and Machine Learning
From: Sergio M. C. <ser...@gm...> - 2013-08-27 15:21:22
|
Hi, is there any IUPAC name generator in RDKit? e.g. for transforming "CC(C)O" into "propan-2-ol" ? Many thanks Sergio |
From: Greg L. <gre...@gm...> - 2013-08-27 15:45:57
|
Dear Sergio,

On Tue, Aug 27, 2013 at 5:21 PM, Sergio Martinez Cuesta <ser...@gm...> wrote:
> is there any IUPAC name generator in RDKit?
> e.g. for transforming "CC(C)O" into "propan-2-ol" ?

There is not. In fact, I'm not aware of any open source structure->name converters. -greg |
From: Sergio M. C. <ser...@gm...> - 2013-08-27 15:54:49
|
thanks Greg, indeed, I only found commercial software for it http://www.chemaxon.com/marvin/help/applications/molconvert.html cheers Sergio |
From: Markus H. <mar...@mo...> - 2013-08-27 17:01:12
|
Hi Sergio,

here is a solution that uses a free web service offered by the NIH. It's independent of the RDKit but rather slow. Anyway, if you don't need to process too many molecules at a time, or if time is not the critical factor, maybe it could serve as an interim solution:

    import urllib2

    def smi_to_iupac(smi):
        # Ask the NCI resolver for the IUPAC name of a SMILES string.
        try:
            url = 'http://cactus.nci.nih.gov/chemical/structure/' + smi + '/iupac_name'
            iupacName = urllib2.urlopen(url).read()
            return iupacName
        except urllib2.HTTPError, e:
            # The service answered, but with an error status.
            print "HTTP error: %d" % e.code
            return None
        except urllib2.URLError, e:
            # The service could not be reached at all.
            print "Network error: %s" % e.reason
            return None
        except:
            print "conversion failed for smiles " + smi
            return None

    smiles = ["CC(O)C", "CC(=O)O", "O=C2OCC(=C2\c1ccccc1)\c3ccc(cc3)S(=O)(=O)C"]

    for s in smiles:
        print smi_to_iupac(s)

returns

    Propan-2-ol
    acetic acid
    4-(4-methylsulfonylphenyl)-3-phenyl-5H-furan-2-one

By the way, this service offers conversions between many different molecule formats/identifiers. I have used it in the past for CAS number look-up. Best, Markus |
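The snippet in this message is Python 2 (`urllib2`, `print` statements). A minimal Python 3 sketch of the same lookup could look like the one below. It assumes the resolver URL pattern quoted in the thread is still valid; the helper name `cir_url`, the `representation` argument and the `timeout` parameter are illustrative additions, not part of the original post. The identifier is percent-encoded because SMILES can contain characters such as `#`, `/` and `\` that are not safe in a URL path.

```python
from typing import Optional
from urllib.error import HTTPError, URLError
from urllib.parse import quote
from urllib.request import urlopen

# Base URL of the NCI Chemical Identifier Resolver, as given in the thread.
CIR_BASE = "http://cactus.nci.nih.gov/chemical/structure"


def cir_url(identifier: str, representation: str = "iupac_name") -> str:
    """Build a resolver URL, percent-encoding the identifier."""
    # safe="" makes quote() also encode "/", which occurs in SMILES.
    return f"{CIR_BASE}/{quote(identifier, safe='')}/{representation}"


def smi_to_iupac(smi: str, timeout: float = 30.0) -> Optional[str]:
    """Return the resolver's name for a SMILES, or None if the lookup fails."""
    try:
        with urlopen(cir_url(smi), timeout=timeout) as response:
            return response.read().decode("utf-8").strip()
    except HTTPError as err:
        # The service answered, but with an error status.
        print(f"HTTP error {err.code} for {smi}")
    except URLError as err:
        # The service could not be reached.
        print(f"Network error for {smi}: {err.reason}")
    return None
```

Calling `smi_to_iupac("CC(C)O")` performs one HTTP request per molecule, so the remark about speed still applies when processing larger sets.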
From: David H. <li...@co...> - 2013-08-27 17:08:07
|
Not sure what software is behind it, but the NCI's Chemical Identifier Resolver may suit your needs. For your example, the URL: http://cactus.nci.nih.gov/chemical/structure/CC(C)O/iupac_name returns Propan-2-ol -David |
From: George P. <gpa...@gm...> - 2013-08-27 17:25:16
|
I think this is not an actual structure-to-name converter but a look-up service based on a predefined dictionary. If this is true, then it won't return anything for novel/unseen structures. Give it a try and let us know. George. |
From: Vladimir C. <ch...@gm...> - 2013-08-27 17:40:55
|
Hi, did you try http://opsin.ch.cam.ac.uk/ ? Vladimir Chupakhin |
From: George P. <gpa...@gm...> - 2013-08-27 17:48:46
|
OPSIN wouldn't help very much here, as it deals with the inverse problem, i.e. name to structure. George. |
From: Markus S. <sit...@he...> - 2013-08-27 18:12:43
|
Yes, in this direction (structure to name) the Resolver is only a database lookup. In the other direction (name to structure), it first uses OPSIN (Daniel Lowe's library), which can resolve correct IUPAC names generically; if OPSIN "fails", it does a database lookup, too. Markus |
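The name-to-structure behaviour described here (try a rule-based parser first, fall back to a database lookup) can be illustrated with a small sketch. This is not the Resolver's actual code: `resolve_name`, `toy_parser`, `toy_lookup` and `TOY_DATABASE` are hypothetical stand-ins that only show the fallback order.

```python
from typing import Callable, Iterable, Optional

# A resolver takes a chemical name and returns a SMILES string, or None.
Resolver = Callable[[str], Optional[str]]


def resolve_name(name: str, resolvers: Iterable[Resolver]) -> Optional[str]:
    """Try each resolver in order and return the first non-empty result."""
    for resolver in resolvers:
        result = resolver(name)
        if result:
            return result
    return None


# Hypothetical stand-in for a rule-based parser: it understands exactly
# one systematic name.
def toy_parser(name: str) -> Optional[str]:
    return "CC(C)O" if name == "propan-2-ol" else None


# Hypothetical stand-in for a name database.
TOY_DATABASE = {"acetic acid": "CC(=O)O"}


def toy_lookup(name: str) -> Optional[str]:
    return TOY_DATABASE.get(name)


CHAIN = [toy_parser, toy_lookup]
print(resolve_name("propan-2-ol", CHAIN))  # handled by the parser
print(resolve_name("acetic acid", CHAIN))  # falls through to the lookup
print(resolve_name("unobtainium", CHAIN))  # neither step knows it
```

In the structure-to-name direction there is only the lookup step, which is why structures missing from the database yield no name at all.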
From: Sergio M. C. <ser...@gm...> - 2013-08-27 20:33:03
|
Oc(:[nH2]):[nH2] does not seem to be in the database http://cactus.nci.nih.gov/chemical/structure/Oc(:[nH2]):[nH2]/iupac_name molconvert does not generate a name either. |
From: Greg L. <gre...@gm...> - 2013-08-28 04:54:06
|
On Tue, Aug 27, 2013 at 10:32 PM, Sergio Martinez Cuesta <ser...@gm...> wrote:
> Oc(:[nH2]):[nH2] does not seem to be in the database
> http://cactus.nci.nih.gov/chemical/structure/Oc(:[nH2]):[nH2]/iupac_name
> molconvert does not generate a name either.

That's not actually a stable molecule. It is, at best, a piece of a molecule. OC(N)N works fine with the NCI lookup. What molecule are you trying to name? -greg |
From: Sergio M. C. <ser...@gm...> - 2013-08-28 11:54:33
|
Thanks Greg, I agree, it certainly works for molecules. However, I am testing whether cactus is able to provide names for molecular fragments as well. Things like COP do get a name (methyl phosphinite). See: http://cactus.nci.nih.gov/chemical/structure/COP/iupac_name Do you have any hints on systematically naming molecular fragments? |
From: Markus S. <sit...@he...> - 2013-08-28 14:44:47
|
Hi Sergio, there may be random entries in the Resolver database behind http://cactus.nci.nih.gov/chemical/structure but definitely nothing systematic. Markus |