From: SourceForge.net <no...@so...> - 2012-08-30 15:18:53
|
Patches item #3562570, was opened at 2012-08-28 18:51 Message generated for change (Comment added) made by jwmay You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=320024&aid=3562570&group_id=20024 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: master Group: None Status: Open Resolution: None Priority: 5 Private: No Submitted By: Asad (asadrahman) Assigned to: Egon Willighagen (egonw) Summary: SP Fingerprint Initial Comment: Attached is the Shortest path based fingerprinter. more info: http://chembioinfo.com/ http://wp.me/JsHe Patch is here https://github.com/downloads/asad/ShortestPathMoleculeFingerprinter/shortest_path_fingerprint.patch ---------------------------------------------------------------------- Comment By: John May (jwmay) Date: 2012-08-30 08:18 Message: Just checking, would it be possible to remove netbeans project files in nbproject/ ---------------------------------------------------------------------- Comment By: Asad (asadrahman) Date: 2012-08-30 07:40 Message: Hi John, Thanks for the detailed review. Here is the latest patch. https://github.com/downloads/asad/ShortestPathMoleculeFingerprinter/SP_fingerprinter_updated_patch_with_reviewers_recommendations.patch a) I have made changes as recommended by you (most of therm). b) Let me know if it works at your end. c)I I will check the failed case once you are done with this patch. Just to clarify. Should the fingerprint report that they are subsets? Many thanks Asad ---------------------------------------------------------------------- Comment By: John May (jwmay) Date: 2012-08-30 03:55 Message: Okay I found the problem. The super class test bug706786 fails as the fingerprint of one molecule is a subset of the other. I've attached the images of the molecules to this tracker. Original Bug: https://sourceforge.net/tracker/?func=detail&aid=706786&group_id=20024&atid=120024 ---------------------------------------------------------------------- Comment By: John May (jwmay) Date: 2012-08-30 03:31 Message: Urgh sourceforge's comment box locked at a tiny size, excuse my grammar in that last comment :p. The AtomComparator needs to be an inner class or in a file on it's own. The cdk module/deplibs etc wasn't picking it up in the jar so I had to move it to an inner class of the simple canonicalizer. I tried adding a module tag but I guess it only picks up one per class. Anyways the all new tests pass as expected. AbstractFixedLengthFingerprinterTest is throwing some null's around but I can't immediately see why.... J ---------------------------------------------------------------------- Comment By: John May (jwmay) Date: 2012-08-30 03:21 Message: Hi Asad, It looks good I'm just trying to run the test (having problems making sure all deps are in place). I few a couple of comments/questions though (yay for sourceforge's inability to nicely add code to comments). Questions/Comments: StringBuffer/Synchronised Collections Maybe I missed something but I couldn't see why you need to use StringBuffer opposed to builder (which you've used in other cases). The SimpleAtomCanonicalisation also uses Collections.synchronizedCollection(). The overhead of synchronising isn't worth it unless the collections are explicitly going to be shared between threads. In this case the SimpleAtomCanonicalisation is only used in a single method (and can only be called in a single thread). As I said maybe I missed something but I couldn't see a use synchronisation. SimpleAtomCanonicalisation Could you add some documentation to the top/methods. Possibly explain why it's needed and where it's being used. AtomTyping/Aromaticity Detection I'm not sure whether performing the atom typing/aromaticity detection within the bitset is a good idea. This is more a problem with having to explicitly type atoms - in fact I had an idea on how to do an AutoTypingAtomContainer but that's for another day. The atom typing method is relatively slow and forcing the atoms to be typed every time you want to calculate the fingerprint isn't necessary if they're already typed. As the typing only needs to be done when you load the container (or change it) and the "role" of this class is to generate a fingerprint not type and generate a fingerprint - I would be tempted to remove that. If it really should stay then I would add a comment to the documentation that the method will re-type and detect aromaticity. I think adding commons-math is a good idea. Although it's only used for the number generation here it could be useful elsewhere and replace some methods and having used it before it's definitely a good inclusion. - ShortestPathFingerprinterTest should be in module test-fingerprint - commons-math*.jar needs to be added to test-fingerprint.libdepends - AtomComparator needs a cdk-module I think... Minor Coding Suggestions: AtomComparator: - o1.getSymbol().compareToIgnoreCase(o2.getSymbol()) == 0 in the conditional can be replaced with equalsIgnoreCase() RandomNumber: - generateMersenneTwisterRandomNumber casts an int to long (return type) then when it's used in shortest path fingerprinter it casts the long back to an int. I would make generateMersenneTwisterRandomNumber return an int instead. - generateMersenneTwisterRandomNumber(int maximum) is never used ShortestPathFingerprinter: - I would make the RandomNumber a field as opposed to using it as the superclass. It also means that if you want to change the random number generation method (I normally use an XORShift) you can do so at a later date. Autoboxing: all instances of hash.intValue() and new Integer() can be removed. Autoboxing takes care of this and makes the code easier to read. StringBuilder can take primitives in it's append method: - radicalInformation.append("RAD: ".concat(String.valueOf(container.getSingleElectronCount()))); can be replaced with (also faster) radicalInformation.append("RAD: ").append(container.getSingleElectronCount()); MFingerprintTests - This could be done after but ShortestPathFingerprinterTest should be added to MFingerprintTests Thanks, J ---------------------------------------------------------------------- Comment By: Asad (asadrahman) Date: 2012-08-29 04:46 Message: Kindly ignore the previous patch as its not complete. Please use the latest patch. https://github.com/downloads/asad/ShortestPathMoleculeFingerprinter/correct_code_for_shortest_path_fingerprint.patch ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=320024&aid=3562570&group_id=20024 |