From: Joos K. <jo...@su...> - 2012-11-09 10:19:54
|
Hi all, I have an AtomContainer with disconnected structures. It is created from separate connected AtomContainers, for the export of mixtures to SDF. The issue is to create an SDF file where all the disconnected structures are nicely displayed. Running StructureDiagramGenerator only works for connected structures, and GeometryTools also doesn't seem to offer an out-of-the-box solution. Is there any tool in CDK that can do this? Thanks! |
From: Egon W. <ego...@gm...> - 2012-11-09 10:34:49
|
Hi Joos,

On Fri, Nov 9, 2012 at 11:19 AM, Joos Kiener <jo...@su...> wrote:
> The issue is to create an SDF file where all the disconnected structures are
> nicely displayed. Running StructureDiagramGenerator only works for
> connected structures, and GeometryTools also doesn't seem to offer an
> out-of-the-box solution.
>
> Is there any tool in CDK that can do this?

The ConnectivityChecker can be used to split disconnected structures into a list of "molecules".

Because some discussion in the bug tracker came up around this too, let's set some definitions. A 'molecule' is a chemical graph where each atom can be reached from any other atom via one or more bonds. A salt is a disconnected structure, consisting of two or more 'molecules'.

The SDG tool in the CDK is an algorithm to lay out 'molecules', and thus fails on salts. That is not a bug, just a limitation of the algorithm.

To work around this, you can split up the structure first, then do SDG on each molecule.

Now, to combine them again, you need to do some tricks, one of which is to decide how you would put them in one diagram, for example as a table of structures. We used to have code for that in the CDK, and I used that in the past to lay out multiple reactions.

That said, I would love an algorithm in the CDK to lay out salts... If we introduce a dative bond concept in 'master', then the current SDG algorithm may actually work. We do not have such a bond right now, and it will require code updates all over the place, but I'd be happy to see that. Then we'd only need an algorithm to take a 'salt' and connect positive atoms with negative atoms... (there may be multiple charges, etc.)

OK, let me update the atom type perception in master first, and then I will try to look at the dative bond idea.

Egon

--
Dr E.L. Willighagen
Postdoctoral Researcher
Department of Bioinformatics - BiGCaT
Maastricht University (http://www.bigcat.unimaas.nl/)
Homepage: http://egonw.github.com/
LinkedIn: http://se.linkedin.com/in/egonw
Blog: http://chem-bla-ics.blogspot.com/
PubList: http://www.citeulike.org/user/egonw/tag/papers |
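The "split, lay out each part, then combine" workaround described above can be sketched without any CDK calls. The following is only an illustration under stated assumptions: each fragment is taken to be already laid out on its own (for example by running SDG per fragment) and is represented here as a plain array of {x, y} points rather than as CDK atoms. The class name `FragmentRowLayout`, the method `arrangeInRow` and the `gap` parameter are hypothetical and not part of the CDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not CDK API): puts separately laid-out fragments into one row. */
final class FragmentRowLayout {

    /**
     * Translates every fragment in place so that the fragments sit side by side,
     * left to right, separated by 'gap' and centred vertically on y = 0.
     * Each fragment is an array of {x, y} coordinates.
     */
    static void arrangeInRow(List<double[][]> fragments, double gap) {
        double cursorX = 0.0; // left edge for the next fragment
        for (double[][] fragment : fragments) {
            if (fragment.length == 0) {
                continue; // nothing to place
            }
            // Bounding box of this fragment.
            double minX = Double.POSITIVE_INFINITY, maxX = Double.NEGATIVE_INFINITY;
            double minY = Double.POSITIVE_INFINITY, maxY = Double.NEGATIVE_INFINITY;
            for (double[] p : fragment) {
                minX = Math.min(minX, p[0]);
                maxX = Math.max(maxX, p[0]);
                minY = Math.min(minY, p[1]);
                maxY = Math.max(maxY, p[1]);
            }
            double dx = cursorX - minX;         // move left edge to the cursor
            double dy = -(minY + maxY) / 2.0;   // centre the box on y = 0
            for (double[] p : fragment) {
                p[0] += dx;
                p[1] += dy;
            }
            cursorX += (maxX - minX) + gap;     // advance past this fragment
        }
    }

    public static void main(String[] args) {
        List<double[][]> fragments = new ArrayList<>();
        fragments.add(new double[][] {{0, 0}, {1, 0}});
        fragments.add(new double[][] {{5, 5}, {6, 7}});
        arrangeInRow(fragments, 1.0);
        for (double[][] fragment : fragments) {
            System.out.println(Arrays.deepToString(fragment));
        }
    }
}
```

A row is the simplest arrangement; the "table of structures" mentioned above would add a row break once the cursor exceeds some chosen width. Once the coordinates are written back to the atoms of the combined container, the disconnected parts no longer overlap in the exported file.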
From: Nina J. <jel...@gm...> - 2012-11-09 10:51:15
|
Hi Joos,

I think I have posted this code before in reply to a similar question:
https://ambit.svn.sourceforge.net/svnroot/ambit/trunk/ambit2-all/ambit2-rendering/src/main/java/ambit2/rendering/CompoundImageTools.java

It will render any set of disconnected structures inside a single atom container. An online demo of how it works is at
http://apps.ideaconsult.net:8080/ambit2/depict/cdk?search=CCCCCC.CCC.CC

The jar is ambit2-rendering, available from the Maven repository
http://ambit.uni-plovdiv.bg:8083/nexus/index.html#nexus-search;quick~ambit2-rendering
or from SourceForge
http://sourceforge.net/projects/ambit/files/Ambit2/AMBIT_modules/2.4.8/

Best regards,
Nina |
From: <ra...@ar...> - 2012-11-16 10:25:32
|
On Fri, Nov 09, 2012 at 11:34:18AM +0100, Egon Willighagen wrote:
> The ConnectivityChecker can be used to split disconnected structures into
> a list of "molecules".
>
> Because some discussion in the bug tracker came up around this too,
> let's set some definitions.
> A 'molecule' is a chemical graph, where each atom can be reached from
> another atom via one or more bonds.
> A salt is a disconnected structure, consisting of two or more 'molecules'.

You would then have to add that the above-defined molecule has nothing to do with CDK's Molecule class, which is, fortunately for your definition, being phased out. |
From: Egon W. <ego...@gm...> - 2012-11-16 10:44:49
|
On Fri, Nov 16, 2012 at 11:16 AM, <ra...@ar...> wrote:
> You would then have to add that the above defined molecule has
> nothing to do with CDK's Molecule class which is, fortunately for
> your definition, phased out.

Yeah, the CDK 'Molecule' was aimed at fully connected graphs... but that was just too confusing for people, the SmilesParser indeed being a very nice example...

Egon |
From: <ra...@ar...> - 2012-11-16 11:27:29
|
John, Egon, and I seem to have come to an optimal solution regarding a long-standing issue, i.e., why the user has to call isConnected (doing an unnecessary graph walk) and then partitionIntoMolecules() to separate the parts of a SMILES molecule, when the parts could easily be obtained from the parser itself. There will soon be an additional accessor on the parser returning an AtomContainerSet. This will also make it easy to enable the display of disconnected SMILES in JCP. (It wasn't difficult before, it just wasn't implemented.)

Since this also concerns a user's request, I have set Cc accordingly.

Regards,
Ralf Stephan |
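To illustrate why the separate connectivity check is redundant, here is a minimal sketch in plain Java (not CDK code): a single breadth-first traversal labels every atom with its component, and "is it connected?" is then just "is there exactly one label?". Atoms are assumed to be numbered 0..n-1 and bonds given as index pairs; `ComponentSplitter`, `labelComponents` and `componentCount` are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Hypothetical helper (not CDK API): connected components in one traversal. */
final class ComponentSplitter {

    /** Returns, for each atom index, the index of the connected component it belongs to. */
    static int[] labelComponents(int atomCount, int[][] bonds) {
        // Build an adjacency list from the bond index pairs.
        List<List<Integer>> neighbours = new ArrayList<>();
        for (int i = 0; i < atomCount; i++) {
            neighbours.add(new ArrayList<>());
        }
        for (int[] bond : bonds) {
            neighbours.get(bond[0]).add(bond[1]);
            neighbours.get(bond[1]).add(bond[0]);
        }

        int[] label = new int[atomCount];
        Arrays.fill(label, -1); // -1 means "not visited yet"
        int nextLabel = 0;
        Deque<Integer> queue = new ArrayDeque<>();
        for (int start = 0; start < atomCount; start++) {
            if (label[start] != -1) {
                continue; // already reached from an earlier start atom
            }
            label[start] = nextLabel;
            queue.add(start);
            while (!queue.isEmpty()) {
                int atom = queue.poll();
                for (int neighbour : neighbours.get(atom)) {
                    if (label[neighbour] == -1) {
                        label[neighbour] = nextLabel;
                        queue.add(neighbour);
                    }
                }
            }
            nextLabel++;
        }
        return label;
    }

    /** Number of components; the structure is connected exactly when this is 1. */
    static int componentCount(int[] labels) {
        int max = -1;
        for (int l : labels) {
            max = Math.max(max, l);
        }
        return max + 1;
    }

    public static void main(String[] args) {
        // Five atoms: a three-atom chain plus a separate two-atom fragment.
        int[] labels = labelComponents(5, new int[][] {{0, 1}, {1, 2}, {3, 4}});
        System.out.println(Arrays.toString(labels) + " -> " + componentCount(labels) + " components");
    }
}
```

Calling a connectivity check first and a partition afterwards walks the same graph twice; the labels above answer both questions at once, which is the motivation for getting the parts directly from the parser.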
From: Nina J. <jel...@gm...> - 2012-11-16 11:59:40
|
On 16 November 2012 13:18, <ra...@ar...> wrote:
> There will soon be an additional accessor to the parser returning
> an AtomContainerSet. This will also make it easy to enable
> display disconnected Smiles in JCP. (It wasn't difficult before,
> it just wasn't implemented)

If the suggestion is for the SMILES parser to return an AtomContainerSet instead of a single molecule, I believe it is not the right approach. It might be useful for visualisation purposes, but not for calculations (yes, there are prediction algorithms that take a 'disconnected' salt).

Best regards,
Nina |
From: <ra...@ar...> - 2012-11-16 14:41:56
|
On Fri, Nov 16, 2012 at 01:59:33PM +0200, Nina Jeliazkova wrote:
> On 16 November 2012 13:18, <ra...@ar...> wrote:
> > There will soon be an additional accessor to the parser returning
> > an AtomContainerSet. This will also make it easy to enable
>
> If the suggestion is for the SMILES parser to return AtomContainerSet
> instead of a single molecule, I believe it is not the right approach. It
> might be useful for visualisation purposes, but not for calculations (yes,
> there are prediction algorithms who take a 'disconnected' salt).

See the two lines above ;)

I'm not sure if a query function would make a difference w.r.t. thread safety compared to what we have. I have to read more about that. |
From: Nina J. <jel...@gm...> - 2012-11-16 15:55:35
|
On 16 November 2012 16:32, <ra...@ar...> wrote:
> On Fri, Nov 16, 2012 at 01:59:33PM +0200, Nina Jeliazkova wrote:
> > If the suggestion is for the SMILES parser to return AtomContainerSet
> > instead of a single molecule, I believe it is not the right approach.
>
> See the two lines above ;)

OK, if it is an alternative one, then fine :)

IMHO, the best solution for the disconnected layout issue is just to provide a class/method to do the layout, rather than fixing readers. The SMILES accessor is not a generic solution; disconnected structures may come from a variety of formats or calculations. Does it mean the SDF, CML, etc. readers should have the same type of accessor?

I might be missing something, of course.

Regards,
Nina |
From: <ra...@ar...> - 2012-11-16 15:12:29
|
> I'm not sure if a query function would make a difference w.r.t.
> thread safety compared to what we have. I have to read more about
> that.

In particular, the parser's chiralityInfo field is not tied to the molecule, so it can go as stale with multiple threads as the AtomContainerSet. |
From: <ra...@ar...> - 2012-11-16 15:37:33
|
That was wrong, apologies. Indeed, at the moment the parseSmiles function is atomic, i.e. it has no state, and so is thread-safe. Having state means there must be a lock that prevents access to the stale object. John, would this address your concerns?

The ultimate solution would be a different class, probably with both subclassing from a common ancestor. But then there is no chance of it coming to cdk-1.4.x.
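One way to picture the state problem, and a lock-free alternative to it, is the sketch below. It is only an illustration and an assumption on my part, not the design agreed on in this thread: instead of the parser remembering the parts of the last input in a field (which a second thread could overwrite before the first one reads it), each call returns an immutable result that carries both the whole input and its parts. `FragmentParser` and `Result` are hypothetical names, and splitting the string on '.' is a toy stand-in for real SMILES parsing (ring-closure digits can bond atoms across a dot, so this is not a faithful rule).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical stateless parser (toy stand-in, not the CDK SmilesParser). */
final class FragmentParser {

    /** Immutable per-call result: nothing is stored on the parser itself. */
    static final class Result {
        private final String whole;
        private final List<String> parts;

        Result(String whole, List<String> parts) {
            this.whole = whole;
            // Defensive copy plus unmodifiable view keeps the result immutable.
            this.parts = Collections.unmodifiableList(new ArrayList<>(parts));
        }

        String whole() {
            return whole;
        }

        List<String> parts() {
            return parts;
        }
    }

    private FragmentParser() {
        // no instances, hence no shared state between threads
    }

    /** Parses one input; safe to call concurrently because it touches only locals. */
    static Result parse(String smiles) {
        // Toy rule: treat '.' as the separator between disconnected parts.
        List<String> parts = Arrays.asList(smiles.split("\\."));
        return new Result(smiles, parts);
    }

    public static void main(String[] args) {
        Result result = parse("CCCCCC.CCC.CC");
        System.out.println(result.whole() + " has " + result.parts().size() + " parts");
    }
}
```

With this shape a caller that wants the single container uses one accessor and a caller that wants the parts uses the other, and no lock is needed because there is no stale object to protect. Whether such a return type could be introduced without breaking the existing parseSmiles signature is a separate question.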