Menu

#40 GenerateFragments.java - a possible patch

closed
None
5
2012-10-08
2009-04-26
Boryeu Mao
No

The current algorithm for enumerating Bemis-Murcko fragments in GenerateFragment.java works well when the ring fragments of a molecule are linearly arranged, but misses some fragments when a molecule is 'branched'. For example, the list of murcko fragments for the 'four-rings.mol' (in the attached gzipped-tar archive) does not include one of all four rings. A possible alternative is to enumerate the fragments recursively ('GenerateFragments.java.patch' in the attached archive). Instead of directly generating murcko fragments, the double loop f and g in the original code (beginning at lines 157 and 162 respectively) are used to generate relevant data on linkers that are composed of path atoms only. The information on no-ring linkers are then passed to the newly added method recursiveFragmentAdd().

Discussion

  • Boryeu Mao

    Boryeu Mao - 2009-04-26

    a test case and a possible patch for recursive murcko fragments

     
  • Egon Willighagen

    Christian, can you please have a look at the recent bug report (with unit test) and this patch?

     
  • Rajarshi Guha

    Rajarshi Guha - 2009-06-10

    I took a rather high level look and the patch seems to make sense. However I can't apply the patch (via patch) so can't run it. Also a unit test should be provided

     
  • Boryeu Mao

    Boryeu Mao - 2009-06-10

    Hi Rajarshi - Just re-checked the patch file: it was generated against GenerateFragement.java in 1.2.1, but ran against that in 1.2.3 (with the one extra line properly offset by patch). I am using patch 2.5.9 (on a Debian-based GNU/Linux). Could you describe how patch fails for you, so I could perhaps re-make a '.patch' file that will work? Or should I just attach a patched java file? I will also try to work up a unit test. Thanks, Boryeu

     
  • Rajarshi Guha

    Rajarshi Guha - 2009-06-10

    patch 2.5.8 (OS X Leopard)

    I~/src/java/cdk (uit)$ patch -p3 < ~/Downloads/GenerateFragments.java.patch
    missing header for context diff at line 3 of patch
    can't find file to patch at input line 3
    Perhaps you used the wrong -p or --strip option?
    The text leading up to this was:


    *** /w/cdk-src+libs-1.2.1/src/main/org/openscience/cdk/tools/GenerateFragments.java-orig Fri Apr 17 02:36:29 2009
    --------------------------
    File to patch:
     
  • Boryeu Mao

    Boryeu Mao - 2009-06-11

    I see - 'patch -p3 < (patchfile)' tries to deduce the name of file to patch from the info in the patchfile, and the file from which I made the patchfile (and its name encoded into the patchfile) is in a non-std location , thus the failure. SO this is likely not due to the version of the patch command. The best way would be to apply the patch explicitly:

    $ (cd to the directory where GenerateFragments.java is located)
    $ patch GenerateFragments.java < ~/Downloads/GenerateFragments.java.patch
    patching file GenerateFragments.java
    Hunk #1 succeeded at 125 (offset 1 line).
    Hunk #2 succeeded at 202 (offset 1 line).
    Hunk #3 succeeded at 265 (offset 1 line).
    Hunk #4 succeeded at 298 (offset 1 line).
    Hunk #5 succeeded at 420 (offset 1 line).
    Hunk #6 succeeded at 434 (offset 1 line).
    $

    (The above output came from patch command applying the patchfile to GenerateFragments.java in version 1.2.3) (For version 1.2.1, there would be no offsets)

     
  • Boryeu Mao

    Boryeu Mao - 2009-06-12

    unit tests for 'GenerateFragments.java-patch'

     
  • Boryeu Mao

    Boryeu Mao - 2009-06-12

    Unit test Test18 for GenerateFragements.java is enclosed in the tgz file 'GenerateFragments-recursive-unitTest.tgz', intended for the recursive enumeration routine. A second test (Test19) in the tgz file is intended for the fix of Tracker:Bugs ID 2871449; the fix is a part of 'GenerateFragments.java.patch'.

     
  • Egon Willighagen

    Attached a git patch against CDK master.

    I am not very eager to say if this patch is good or not. Firstly, with the current code base I have several failing unit tests; the patch increases this number.

    Some comments for improvement:

    • Use generics for List etc... List<String> is better than just List.
    • if you are not using a Vector in multithreaded way, use the ArrayList impl instead
    • use 120 characters line length
    • use @cdk.bug in the unit test JavaDoc (did that in the git patch)
     
  • Rajarshi Guha

    Rajarshi Guha - 2010-05-25

    CLosing this since a new Murcko fragment class is being written which should be cleaner and more maintanable

     

Log in to post a comment.