From: Stefan K. <ste...@eb...> - 2008-12-19 17:27:58
|
Hi all, I came once more across a constant problem, which is preconditions, i. e. things which are required by a method to be done before. I think we do not have the policy to check this in the class, but then at least documentation should be there. Take the class LonePairElectronChecker. All methods as far as I can see rely on atom.getAtomTypeName(), which returns null, if atoms type detection has not been done before, right? But the javadoc of the class does not mention anything. I think this is really bad (and if the LonePairElectronChecker does not need atom type perception or I missed the comment, forgive me, but I am sure there are many classes suffering from that problem. Season greetings Stefan -- Stefan Kuhn B. Sc. M. A. Software Engineer in the Chemoinformatics and Metabolism Team European Bioinformatics Institute (EBI) Wellcome Trust Genome Campus Hinxton, Cambridge CB10 1SD UK Phone +44 1223 49 2657 Fax +44 (0)1223 494 468 |
From: Rajarshi G. <rg...@in...> - 2008-12-19 18:43:16
|
On Dec 19, 2008, at 12:27 PM, Stefan Kuhn wrote: > Hi all, > I came once more across a constant problem, which is preconditions, > i. e. > things which are required by a method to be done before. I think we > do not > have the policy to check this in the class, but then at least > documentation > should be there. > Take the class LonePairElectronChecker. All methods as far as I can > see rely > on atom.getAtomTypeName(), which returns null, if atoms type > detection has > not been done before, right? But the javadoc of the class does not > mention > anything. I think this is really bad (and if the > LonePairElectronChecker does > not need atom type perception or I missed the comment, forgive me, > but I am > sure there are many classes suffering from that problem. I agree that this is a pain. The only way out as far as I can see, is to update docs when faced with the problem. More rigorously, the code itself should be making the appropriate checks. In general what I'd like is to have a set of flags indicating that some operation has been performed, such aromaticity detected, atoms typed etc. This would be one way to explicitly check for this (and would also be part of a caching scheme) ------------------------------------------------------------------- Rajarshi Guha <rg...@in...> GPG Fingerprint: D070 5427 CC5B 7938 929C DD13 66A1 922C 51E7 9E84 ------------------------------------------------------------------- A mathematician is a device for turning coffee into theorems. -- P. Erdos |
From: Egon W. <ego...@gm...> - 2008-12-19 19:32:50
|
On Fri, Dec 19, 2008 at 6:27 PM, Stefan Kuhn <ste...@eb...> wrote: > I came once more across a constant problem, which is preconditions, i. e. > things which are required by a method to be done before. I think we do not > have the policy to check this in the class, but then at least documentation > should be there. Unwritten policy is actually for the CDK *not* to do input checking, and leave that to the one is is using the building blocks... like it or not, that has been the policy. It's not written in stone, and they has been a few wishes expressed to do something differntly.. > Take the class LonePairElectronChecker. All methods as far as I can see rely > on atom.getAtomTypeName(), which returns null, if atoms type detection has > not been done before, right? Not done, or when the chemistry is not recognized. For example, a 3 coordinate, neutral oxygen... > But the javadoc of the class does not mention anything. File a bug report. And if you discovered the author, you could also assign the bug to him, or email him to request clarification via the mailing list. I agree that JavaDoc is a problem in the CDK... we all know, and Nightly [0] is actually more than daily reporting about missing JavaDocs... some things lack any form of documention... > I think this is really bad (and if the LonePairElectronChecker does > not need atom type perception or I missed the comment, forgive me, but I am > sure there are many classes suffering from that problem.) I am sure there are... just check Nightly. I share your concern, and hope to see things better documented. Note that it should be the author of the source code who should write the JavaDoc, not anyone else. It's the author who knows what the code should or should not do. That said, people can write patches, to make it easier for the original author to fix the bug. To check who is the original author of some method, please use 'svn annotate' or 'git annotate', which is far more accurate than @cdk.author, @author or copyright notices. Egon 0.See the DocCheck section @ http://pele.farmbio.uu.se/nightly-1.2.x/ or Rajarshi's Nightly server. -- ---- http://chem-bla-ics.blogspot.com/ |
From: Rajarshi G. <rg...@in...> - 2008-12-19 19:34:50
|
On Dec 19, 2008, at 2:32 PM, Egon Willighagen wrote: > On Fri, Dec 19, 2008 at 6:27 PM, Stefan Kuhn > <ste...@eb...> wrote: >> I came once more across a constant problem, which is >> preconditions, i. e. >> things which are required by a method to be done before. I think >> we do not >> have the policy to check this in the class, but then at least >> documentation >> should be there. > > Unwritten policy is actually for the CDK *not* to do input checking, > and leave that to the one is is using the building blocks... What's the rationale for this? ------------------------------------------------------------------- Rajarshi Guha <rg...@in...> GPG Fingerprint: D070 5427 CC5B 7938 929C DD13 66A1 922C 51E7 9E84 ------------------------------------------------------------------- The emacs religion: to be saved, control excess... |
From: Egon W. <ego...@gm...> - 2008-12-19 19:49:49
|
On Fri, Dec 19, 2008 at 8:34 PM, Rajarshi Guha <rg...@in...> wrote: >> Unwritten policy is actually for the CDK *not* to do input checking, >> and leave that to the one is is using the building blocks... > > What's the rationale for this? Control, performance, IIRC. The input data can in very many ways mess up the algorithm... chemoinformatics does not have the well defined semantics as, for example, numerical analysis... BTW, here and there input checking is actually done... The LPChecker, for example, could do atom typing itself, but might then redo something just done already. Same for ring detection, aromaticity... Instead, the choice was made to have building blocks... hence the name Chemistry Development Kit... If the interfaces had a clear separation of mutable and immutable data models, keeping track of what has already been calculated... and then methods like the LPChecker can decide itself when they have to (re)do some precalculation... Egon -- ---- http://chem-bla-ics.blogspot.com/ |
From: Rajarshi G. <rg...@in...> - 2008-12-19 20:00:08
|
On Dec 19, 2008, at 2:49 PM, Egon Willighagen wrote: > On Fri, Dec 19, 2008 at 8:34 PM, Rajarshi Guha <rg...@in...> > wrote: >>> Unwritten policy is actually for the CDK *not* to do input checking, >>> and leave that to the one is is using the building blocks... >> >> What's the rationale for this? > > Control, performance, IIRC. Aaah. I'd say that input checking is different from input processing. In the sense that, IMO, code should check that the input has what they need. If not, throw an exception etc, but not necessarily actually do the pre-processing job itself > The LPChecker, for example, could do atom typing itself, but might > then redo something just done already. Same for ring detection, > aromaticity... I agree > Instead, the choice was made to have building blocks... hence the name > Chemistry Development Kit... > > If the interfaces had a clear separation of mutable and immutable data > models, keeping track of what has already been calculated... and then > methods like the LPChecker can decide itself when they have to (re)do > some precalculation... I see. What about the idea of flags? It would require quite a degree of manual examination to decide when and under what conditions a flag needs to be of or on - but would be good for such preconditioning and wouldn't be a big performance hit ------------------------------------------------------------------- Rajarshi Guha <rg...@in...> GPG Fingerprint: D070 5427 CC5B 7938 929C DD13 66A1 922C 51E7 9E84 ------------------------------------------------------------------- Chemistry professors never die, they just fail to react. |
From: Egon W. <ego...@gm...> - 2008-12-19 20:08:38
|
On Fri, Dec 19, 2008 at 8:59 PM, Rajarshi Guha <rg...@in...> wrote: > On Dec 19, 2008, at 2:49 PM, Egon Willighagen wrote: >> Control, performance, IIRC. > > Aaah. I'd say that input checking is different from input processing. > In the sense that, IMO, code should check that the input has what > they need. If not, throw an exception etc, but not necessarily > actually do the pre-processing job itself <snip> >> If the interfaces had a clear separation of mutable and immutable data >> models, keeping track of what has already been calculated... and then >> methods like the LPChecker can decide itself when they have to (re)do >> some precalculation... > > I see. What about the idea of flags? It would require quite a degree > of manual examination to decide when and under what conditions a flag > needs to be of or on - but would be good for such preconditioning and > wouldn't be a big performance hit I agree that prechecking is not the same as preprocessing... but I do suspect there is a lot of overlap... unless you have a very strict flag invalidation system... otherwise, deciding if atom and bond ring flags are still valid just require a recalculation... We could experiment with with having methods like addAtom() invalidate flags and sorts... In particular when the new unit testing system for the data classes is online (data, datadebug and nonotify)... Egon -- ---- http://chem-bla-ics.blogspot.com/ |
From: Rajarshi G. <rg...@in...> - 2008-12-19 20:15:44
|
On Dec 19, 2008, at 3:08 PM, Egon Willighagen wrote: > > We could experiment with with having methods like addAtom() invalidate > flags and sorts... In it's crudest form, any addXXX() method would result in invalidation of all flags (aromaticity, atom typing etc). From there oen could start examining individual methods to see if it si actually required. ------------------------------------------------------------------- Rajarshi Guha <rg...@in...> GPG Fingerprint: D070 5427 CC5B 7938 929C DD13 66A1 922C 51E7 9E84 ------------------------------------------------------------------- Psychology is merely producing habits out of rats. |