#1273 Block detection increases diff count in editor

Milestone: Trunk
Status: open
Owner: nobody
Priority: 5
Updated: 2006-10-20
Created: 2006-07-19
Creator: Tim Gerundt
Private: No

If I compare two files with moved lines and the option
"Enable moved block detection" is enabled, the
difference count in the editor is higher than the values
in the columns "#differences" and "#sig. differences".

See also forum topic:
Difference count columns in folder compare
https://sourceforge.net/forum/forum.php?thread_id=1538235&forum_id=41639

My test files show 2 differences in the column but 4
differences in the editor.

Tim

Discussion

  • Tim Gerundt

    Tim Gerundt - 2006-07-19

    test files

     
  • Kimmo Varis

    Kimmo Varis - 2006-07-19
    • assigned_to: nobody --> kimmov
     
  • Kimmo Varis

    Kimmo Varis - 2006-07-19

    Logged In: YES
    user_id=631874

    Thanks for test files, I can reproduce this.

    Looking at it..

     
  • Kimmo Varis

    Kimmo Varis - 2006-07-19
    • assigned_to: kimmov --> nobody
     
  • Kimmo Varis

    Kimmo Varis - 2006-07-19

    Logged In: YES
    user_id=631874

    Hmm.. File compare really sees four differences in that
    file. You can see that if you select the differences.

    This is a bug in moved block detection code.

     
  • Kimmo Varis

    Kimmo Varis - 2006-07-19

    Logged In: YES
    user_id=631874

    I don't know or understand that moved block code, but it
    seems it desperately tries to split differences while looking
    for moved lines.

    So what happens with the files in this bug is that each diff
    is actually split in two by the moved block code, and there
    really are 4 diffs. But it looks like there are two, and the
    folder compare code has no idea about moved blocks, so it
    shows two.

    One way to improve this could be to set a minimum number of
    lines that must be identical for a block to be considered
    moved. Say we could define that there must be 5 identical
    lines before we decide it is really a moved block.

    But I don't know what to do about splitting diffs.
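
The minimum-length idea above can be sketched roughly as follows. This is only an illustration, not WinMerge code: the function name `find_moved_runs`, its parameters, and the representation of a difference as two plain lists of lines are all assumptions made for the example.

```python
from typing import List, Tuple


def find_moved_runs(removed: List[str], added: List[str],
                    min_lines: int = 5) -> List[Tuple[int, int, int]]:
    """Return (removed_start, added_start, length) for every maximal run of
    identical consecutive lines that is at least min_lines long.

    Hypothetical helper: 'removed' and 'added' stand for the lines of two
    differences that might contain the same block at different positions.
    """
    runs = []
    for i in range(len(removed)):
        for j in range(len(added)):
            # Skip positions that only continue a run that started earlier.
            if i > 0 and j > 0 and removed[i - 1] == added[j - 1]:
                continue
            length = 0
            while (i + length < len(removed) and j + length < len(added)
                   and removed[i + length] == added[j + length]):
                length += 1
            # Short accidental matches (a lone "}" or blank line) are ignored.
            if length >= min_lines:
                runs.append((i, j, length))
    return runs


removed = ["a", "b", "c", "d", "e", "x"]
added = ["y", "a", "b", "c", "d", "e"]
print(find_moved_runs(removed, added, min_lines=5))  # [(0, 1, 5)]
print(find_moved_runs(removed, added, min_lines=6))  # []
```

With a threshold like this, a single matching line would no longer be enough to split a difference, which is the behaviour the comment above is asking for.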

     
  • elsapo

    elsapo - 2006-10-04

    Logged In: YES
    user_id=1195173

    It is a semantic difference. Differences in the directory
    view are actual differences. Differences in the file editor
    view are traversable blocks. These are different because
    moved blocks are traversable, so they count in the file
    editor view.

    To make these agree, one or both semantics would have to be
    changed, which might be an appropriate design decision,
    but obviously someone would have to decide how to change them.

    Alternatively the texts could be changed so they don't look
    like a semantic clash. Right now it is confusing because
    both places say "differences" even though they're not both
    counting the same thing.
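
To make the two meanings of "differences" concrete, here is a small sketch. The `Block` and `Diff` classes and both counting functions are hypothetical and only model the description above; they are not taken from the WinMerge sources.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    start: int    # first line of the block
    end: int      # last line of the block (inclusive)
    moved: bool   # True if the block was recognised as moved


@dataclass
class Diff:
    # Pieces the difference was split into by moved block detection.
    blocks: List[Block]


def folder_count(diffs: List[Diff]) -> int:
    """Directory view semantics: count actual differences."""
    return len(diffs)


def editor_count(diffs: List[Diff]) -> int:
    """File editor semantics: count every traversable block."""
    return sum(len(d.blocks) for d in diffs)


# Two differences, each split into a moved part and a remainder.
diffs = [
    Diff([Block(1, 2, False), Block(3, 4, True)]),
    Diff([Block(8, 9, True), Block(10, 10, False)]),
]
print(folder_count(diffs))  # 2
print(editor_count(diffs))  # 4
```

The numbers match the report (2 in the column, 4 in the editor) only because the example data was chosen that way; the actual test files are not reproduced here.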

     
  • elsapo

    elsapo - 2006-10-07

    Logged In: YES
    user_id=1195173

    (I actually started to set up to code a patch to fix this,
    but as I looked at the mechanics, I realized that it cannot
    simply be fixed in code, because it is a semantic difference
    that requires either a design change to one or the other, or
    a wording change.)

     
  • Kimmo Varis

    Kimmo Varis - 2006-10-20

    Logged In: YES
    user_id=631874

    I kinda agree with this. We don't have a consistent way to
    define moved blocks. They are just hacked into the diff code
    now, and I'm afraid at the wrong level.

    I think moved block detection should be post-filtering,
    instead of being part of the diffing code. What I suggest is
    roughly:
    1) Compare two files and generate the difference list as is
    currently done without moved blocks.
    2) Go through the difference list and compare the lines in the
    differences. When identical lines are found, either:
    2a) Use some kind of mapping to map those lines together in
    the difference list. This must be part of the difference list,
    not some separate feature.
    2b) Split differences so that a moved block is always a single
    difference, and then map the differences. This creates a new
    difference block type, the moved difference, with a parameter
    that maps it to the matching difference on the other side.
    3) File and folder compare can now count differences as
    usual (case 2b) or differences + additional line mappings
    (case 2a).
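
A rough sketch of variant 2b as a post-filter might look like the following. Everything here is an assumption made for illustration: the `Hunk` class, the `moved_id` field, the `split_moved` function, and the simplification that each difference is a one-sided hunk (lines only in the left file or only in the right file). It uses Python's standard `difflib` to find the longest identical run and is not the diffutils-based code discussed in this ticket.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List, Optional


@dataclass
class Hunk:
    side: str                       # "left" (removed) or "right" (added)
    lines: List[str]
    moved_id: Optional[int] = None  # same id on both halves of one move


def split_moved(hunks: List[Hunk], min_lines: int = 2) -> List[Hunk]:
    """Post-filter a plain difference list (step 2b): split hunks so every
    moved run is its own hunk, and link the two halves through moved_id."""
    hunks = list(hunks)
    min_lines = max(min_lines, 1)
    next_id = 0
    while True:
        best = None  # (size, left_index, right_index, left_start, right_start)
        for i, src in enumerate(hunks):
            if src.side != "left" or src.moved_id is not None:
                continue
            for j, dst in enumerate(hunks):
                if dst.side != "right" or dst.moved_id is not None:
                    continue
                m = SequenceMatcher(None, src.lines, dst.lines,
                                    autojunk=False).find_longest_match(
                    0, len(src.lines), 0, len(dst.lines))
                if m.size >= min_lines and (best is None or m.size > best[0]):
                    best = (m.size, i, j, m.a, m.b)
        if best is None:
            return hunks
        size, i, j, a, b = best
        # Replace the higher index first so the lower index stays valid.
        for idx, start in sorted([(i, a), (j, b)], reverse=True):
            h = hunks[idx]
            parts = [Hunk(h.side, h.lines[:start]),
                     Hunk(h.side, h.lines[start:start + size], next_id),
                     Hunk(h.side, h.lines[start + size:])]
            hunks[idx:idx + 1] = [p for p in parts if p.lines]
        next_id += 1


plain = [Hunk("left", ["x", "m1", "m2"]), Hunk("right", ["m1", "m2", "y"])]
for h in split_moved(plain):
    print(h.side, h.lines, h.moved_id)
```

Because the result is still one flat difference list, file compare and folder compare could both count it the same way (step 3, case 2b): in this example the two original hunks become four, two of which share `moved_id` 0.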

     
  • Kimmo Varis

    Kimmo Varis - 2006-10-20
    • labels: --> DIFF Engine
    • milestone: --> Trunk
     
  • Daniel D.

    Daniel D. - 2007-01-11

    Logged In: YES
    user_id=1689624
    Originator: NO

    This topic is older, but is there a chance of getting good results with block detection? I compared some .php files and there is only about a 40% chance that the lines/blocks are detected correctly. If I adjust the two files or the string, it displays correctly. The detection rate is very low...

     
  • Kimmo Varis

    Kimmo Varis - 2007-01-11

    Logged In: YES
    user_id=631874
    Originator: NO

    You mean detecting moved blocks, right? With current logic there is no possibility for good results as it blindly compares lines in differences.

     
  • Kimmo Varis

    Kimmo Varis - 2008-06-11

    Logged In: YES
    user_id=631874
    Originator: NO

    Moving moved block detection code out of diffutils code would be a good step anyway. Any custom code in diffutils code means it is harder to update.

     
