From: John M. <joh...@gm...> - 2012-11-29 11:11:15
|
Hi Nina, It might be possible but, they seem to be found from only for certain inputs :/. You can of course write a unit test to confirm the behaviour is as expected but in this case it was obviously intended that if either was null it would return '0'. I guess what is possible is to check every use of Comparator<?> - in fact I'll do that this afternoon. Thanks, J On 29 Nov 2012, at 11:02, Nina Jeliazkova <jel...@gm...> wrote: > John, > > Thanks for posting this. Do you think it is possible to do a systemic testing for the contract violation error ? > > > On 29 November 2012 12:50, John May <jo...@eb...> wrote: > Hi all, > > I'm going to post these on the bug tracker but I thought it would be good to share as the problems (could) affect a fair bit of code and uses. > > 1. The HOSECodeGenerator comparator is in contract violation [src]. This is a lot more subtle then the last one in the CanonicalLabeller - if either value is null the comparator returns '0' which is incorrect. This means if we have a=5, b=10 and c=null then a < b, b < c but a == c and b == c. Stephan and I managed to track it down and have written a patch. > 2. HOSECodeGenerator calculates the canonical labels as many times as there atoms - it only needs to do it once. This means the algorithm is closer to O(n^2) - you need to determine 'n' invpairs 'n' times. > 3. The canonical labeller doesn't handle connected atom count/exp/impl hydrogen count correctly. The code tries to separate the counts into 'total count', 'non-hydrogen-count' and 'hydrogen-count' but it does not check for explicit hydrogens [src]. This is even more subtle and was actually how we found the contract violation (i.e. if only happened when we added implicit hydrogens). > > These should be quite simple to fix but I have a feeling the canonical labeller and hose codes are used a lot. > > > Canonical labeler as part of SMILES parsing is indeed used a lot. > > Best regards, > Nina > > > > Many Thanks, > J > John May | Predoctoral Student – Chemoinformatics and Metabolism > European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK > jo...@eb... | +44–(0) 1223 49 2603 > > > ------------------------------------------------------------------------------ > Keep yourself connected to Go Parallel: > VERIFY Test and improve your parallel project with help from experts > and peers. http://goparallel.sourceforge.net > _______________________________________________ > Cdk-devel mailing list > Cdk...@li... > https://lists.sourceforge.net/lists/listinfo/cdk-devel > > > ------------------------------------------------------------------------------ > Keep yourself connected to Go Parallel: > VERIFY Test and improve your parallel project with help from experts > and peers. http://goparallel.sourceforge.net_______________________________________________ > Cdk-devel mailing list > Cdk...@li... > https://lists.sourceforge.net/lists/listinfo/cdk-devel |