From: Egon W. <ego...@gm...> - 2011-06-19 12:21:10
On Sun, Jun 19, 2011 at 12:24 PM, Andrew Dalke <da...@da...> wrote:
> On Jun 19, 2011, at 9:16 AM, Egon Willighagen wrote:
>> Andrew, maybe we should write up a review on aromaticity in
>> cheminformatics? Set up a simple test set of corner cases, describe
>> the algorithms around, and show their limitations, using Open Source
>> implementations?
>
> I am not interested in doing so.

Well, neither am I really :)

> The algorithms are already well-enough known

Maybe for OpenEye users, but not in general. At least the whole idea
that there are different definitions of aromaticity seems very much
lost in the literature, from my perspective. I will try to find time to
read up on their documentation, and should probably get myself academic
licenses for OpenEye and other proprietary tools. (I have not yet made
time to read through all the licenses to make sure I am allowed to
develop CDK stuff while holding such licenses. This sounds absurd, but
it has been a problem over the past 8 years!)

> that, for example,
> OpenEye implements not one but multiple families of aromaticity
> perception, including those from other vendors.

Good! OpenEye has been doing it right here. The CDK only implements one
algorithm. The CDK could use more approaches; I will have to look at
what OpenEye does.

> One of these is the MMFF aromaticity model, which is well-described
> and also implemented in OpenBabel.

Happy to hear that, and I am happy to hear you talk about families and
models here, because I have not seen such talk elsewhere, and it is
precisely how I see this. E.g. the volume calculation paper I just
looked at does not describe *which* model/family they are using for
aromaticity. This is the big problem here, because their parameters are
affected by it.

> Is your point that knowledge, no likely described in the
> literature, just hasn't made its way into CDK? (I haven't looked
> at RDKit to see how it implements this, so I can't say that it's
> a general free software issue.)

One point indeed is that the CDK implements one model right now, and I
welcome alternative methods.

> Such a study would be extremely tedious and I don't understand
> what the end goal would be. Would it be to develop a better
> aromaticity model? In which case it would need a diverse set
> of structures where the aromaticity is known experimentally.

That would in fact be a very good goal, but the problem here is indeed
that such experimental data is not widely available.

> Would it show problems in the overall definition of aromaticity,
> or mostly highlight limitations in the specific implementations
> of the algorithm?

I think it would be good for the larger cheminformatics community to
actually understand 'aromaticity', because it is one significant source
of incompatibility between toolkits right now. This would be tedious,
and would not directly result in new applications. It would help the
community, though.

Egon

-- 
Dr E.L. Willighagen
Postdoctoral Researcher
Institutet för miljömedicin
Karolinska Institutet (http://ki.se/imm)
Homepage: http://egonw.github.com/
LinkedIn: http://se.linkedin.com/in/egonw
Blog: http://chem-bla-ics.blogspot.com/
PubList: http://www.citeulike.org/user/egonw/tag/papers