Block detection increases diff count in editor
Windows visual diff and merge for files and directories
If I compare two files with moved lines and the option
"Enable moved block detection" is enabled, the
difference count in the editor is higher than the value
in the folder compare columns "#differences" and "#sig. differences".
See also forum topic:
Difference count columns in folder compare
https://sourceforge.net/forum/forum.php?thread_id=1538235&forum_id=41639
My test files show 2 differences in the column but 4
differences in the editor.
Tim
test files
Logged In: YES
user_id=631874
Thanks for the test files, I can reproduce this.
Looking into it..
Logged In: YES
user_id=631874
Hmm.. File compare really sees four differences in that
file, you can see that if you select the differences.
This is a bug in the moved block detection code.
Logged In: YES
user_id=631874
I don't know or understand that moved block code, but it seems
to try desperately to split differences while looking for
moved lines.
So what happens with the files in this bug is that each of those diffs is
actually split in two by the moved block code, and there
really are 4 diffs. But it looks like there are two, and
the folder compare code has no idea about moved blocks, so it
shows two.
One way to improve this could be to set a minimum number of lines that
must be identical for a block to be considered moved. Say
we could define that there must be 5 identical lines before
we decide it really is a moved block.
But I don't know what to do about splitting diffs.
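The minimum-length idea could be sketched roughly like this. This is only an illustration in Python using the standard difflib module, not WinMerge or diffutils code; the names `moved_candidates` and `min_lines` are hypothetical, and the inputs are assumed to be the lines removed in one difference and the lines added in another.

```python
from difflib import SequenceMatcher


def moved_candidates(removed, added, min_lines=5):
    """Return (removed_index, added_index, length) for each run of identical
    lines that is at least min_lines long (min_lines must be >= 1).

    Hypothetical helper: shorter runs (a lone "}" or blank line) are ignored,
    so they can no longer be reported as moved blocks.
    """
    matcher = SequenceMatcher(None, removed, added, autojunk=False)
    return [
        (m.a, m.b, m.size)
        for m in matcher.get_matching_blocks()
        if m.size >= min_lines  # also drops the zero-length end marker
    ]


removed = ["a", "b", "c", "d", "e", "x"]
added = ["y", "a", "b", "c", "d", "e"]
five_line_run = moved_candidates(removed, added)            # run of 5 qualifies
single_brace = moved_candidates(["}"], ["foo", "}"])        # run of 1 is ignored
```

With the default threshold of 5 the first pair yields one candidate and the second yields none. Lowering `min_lines` to 1 would bring back the current behaviour where any single identical line counts.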
Logged In: YES
user_id=1195173
It is a semantic difference. Differences in the directory
view are actual differences. Differences in the file editor
view are traversable blocks. These are different because
moved blocks are traversable, so they count in the file
editor view.
To make these agree, one or both semantics would have to be
changed -- which might be an appropriate design decision,
but obviously someone would have to decide how to change them.
Alternatively the texts could be changed so they don't look
like a semantic clash-- right now it is confusing because
both places say "differences" even though they're not both
counting the same thing.
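To make the two meanings of "differences" concrete, here is a toy illustration in Python. It is not how WinMerge stores diffs: each difference is modelled (as an assumption) as a list of per-line flags saying whether the line was detected as moved, and a traversable block is taken to be a run of lines with the same flag. `count_traversable_blocks` is a hypothetical name.

```python
from itertools import groupby


def count_traversable_blocks(diffs):
    """Count runs of consecutive lines with the same moved/not-moved flag.

    Each entry of diffs is one actual difference, given as a list of
    booleans (True = line detected as moved). Assumed model, for
    illustration only.
    """
    return sum(len(list(groupby(flags))) for flags in diffs)


# Two actual differences, each containing one moved and one non-moved line.
diffs = [[False, True], [True, False]]

folder_count = len(diffs)                        # what the columns count
editor_count = count_traversable_blocks(diffs)   # what the editor counts
```

In this model the folder view counts 2 and the editor counts 4, the same mismatch as in the test files, even though both numbers are correct for what they measure.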
Logged In: YES
user_id=1195173
(I actually started to set up to code a patch to fix this,
but as I looked at the mechanics, I realized that it cannot
be simply fixed in code because it is a semantic difference,
that requires either a design change of one or the other, or
a wording change.)
Logged In: YES
user_id=631874
I kinda agree with this. We don't have a consistent way to
define moved blocks. They are just hacked into the diff code
now, and I'm afraid at the wrong level.
I think moved block detection should be a post-filtering step,
instead of being part of the diffing code. What I suggest is
roughly:
1) compare the two files and generate the difference list as is
currently done without moved blocks
2) go through the difference list and compare the lines in the
differences. When identical lines are found, either:
2a) use some kind of mapping to map those lines together in
the difference list. This must be part of the difference list, not
some separate feature
2b) split differences so that a moved block is always a single
difference, and then map the differences. This creates a new
difference block type, the moved difference, which has a parameter
that maps it to the difference on the other side.
3) File and folder compare can now count differences as
usual (case 2b) or differences + additional line mappings
(case 2a).
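A rough sketch of steps 1, 2a and 3, written in Python with the standard difflib module standing in for the diff engine. None of this is WinMerge or diffutils code; the dictionary layout and the names `plain_diffs` and `map_moved_lines` are assumptions made for the example.

```python
from difflib import SequenceMatcher


def plain_diffs(left, right):
    """Step 1: build the difference list with no moved-block handling."""
    matcher = SequenceMatcher(None, left, right, autojunk=False)
    return [
        {"left": (i1, i2), "right": (j1, j2), "moved": {}}
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def map_moved_lines(left, right, diffs):
    """Step 2a: map identical lines that sit inside differences.

    The mapping (left line index -> right line index) is stored in the
    difference list itself, so the number of differences does not change.
    """
    # Right-side lines inside differences that are not yet mapped.
    unmatched = {}
    for diff in diffs:
        for j in range(*diff["right"]):
            unmatched.setdefault(right[j], []).append(j)
    for diff in diffs:
        for i in range(*diff["left"]):
            candidates = unmatched.get(left[i])
            if candidates:
                diff["moved"][i] = candidates.pop(0)
    return diffs


left = ["a", "b", "c", "d"]
right = ["c", "d", "a", "b"]
diffs = map_moved_lines(left, right, plain_diffs(left, right))

# Step 3: both file and folder compare can count the same list.
difference_count = len(diffs)
mapped_lines = sum(len(d["moved"]) for d in diffs)
```

Because the post-filter only annotates the list, `difference_count` stays the same whether or not moved lines are mapped, and the line mappings can be reported separately. This sketch still compares single lines blindly; a real version would need some rule for which matches are worth mapping.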
Logged In: YES
user_id=1689624
Originator: NO
This topic is older, but is there a chance of getting good results with block detection? I compared some .php files and there is only a 40% chance that the lines/blocks are detected correctly. If I adjust the two files or the string, it displays correctly. The detection rate is very low...
Logged In: YES
user_id=631874
Originator: NO
You mean detecting moved blocks, right? With the current logic there is no chance of good results, as it blindly compares lines in differences.
Logged In: YES
user_id=631874
Originator: NO
Moving the moved block detection code out of the diffutils code would be a good step anyway. Any custom code in the diffutils code makes it harder to update.