
#79 Case insensitive diff only enabled via "-iw" switch (not "-i")

V5.2.1
closed
None
8
2021-08-05
2021-07-17
No

Running tkdiff 5.2.1 on simple text files on Linux. I assumed that using the -i option (tkdiff -i <file1> <file2>) would perform a case-insensitive diff. This did not seem to work, however. The diff was still case sensitive when using the "-i" switch by itself. However, I did seem to get a case-insensitive diff if I also added the "-w" switch along with the "-i" switch (tkdiff -iw <file1> <file2>).
This seems odd to me! Does anyone else see this as well? I would have thought that "-i" by itself should be sufficient to enable a case-insensitive diff.

Thanks for your help.  Matt

Discussion

  • michael-m

    michael-m - 2021-07-17

    Your thinking is correct.

    Your description of the problem is a bit vague, though. I am presuming you ran a test in which you expected TkDiff to report "no differences" for two files with much the same content but differing use of upper/lower case characters, and were surprised to see a display that highlighted THOSE characters.

    That the result CHANGED when adding the "-w" diff option tells me that IN ADDITION to the capitalized letters, there was ALSO some difference in the amount of WHITESPACE on those same lines. By supplying that option, those were ALSO ignored, and at that point, nothing ELSE remained on that line which was allowed (by Diff) to be considered DIFFERENT, thus leading TkDiff to annotate NOTHING on that line!

    BOTH options "did their job"; but that job (from TkDiff's perspective) is simply to IDENTIFY which lines to consider AS different, which is ENTIRELY handled by Diff! Afterward, TkDiff is responsible for all the highlighting of what differences EXIST - but is no longer BOUND by the options that were given TO Diff. Part of the reason for this is so TkDiff does not need to understand EVERY OPTION that could possibly be given to Diff (which becomes quite involved), and indeed it does not - it simply PASSES them along and waits for the verdict from Diff of which lines PASSED/FAILED. We do not wish to REINVENT Diff itself.

    SO - what (I believe) you SAW, was some line that actually HAS a real difference in white-space, which WITHOUT the -w WAS REPORTED by Diff. TkDiff then (possibly depending on your personal display settings), HIGHLIGHTED what LOOKS different on that line - and while that WOULD include the REAL "offending" white-space, IT WILL ALSO highlight the upper/lower characters - purely because TkDiff DOESN'T KNOW it was supposedly an ignored "category". Your eye however was most likely DRAWN to the upper/lower characters as they would seem more prominent than an extra space or two here and there, and you thus assumed THEY must be the cause for the difference being reported.

    This situation (the display of inline differences being agnostic of ANY Diff option) IS specifically mentioned in the online Help, and is thus not a true BUG. Hope this explanation helps.
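
The two-stage behaviour described above can be sketched in Python. This is only an illustration, not TkDiff's actual (Tcl) code: `normalize`, `lines_differ`, and `highlight_spans` are hypothetical names, the normalization is a simplified model of Diff's `-i` (ignore case) and `-w` (ignore all whitespace) options, and `difflib.SequenceMatcher` stands in for whatever inline comparison TkDiff performs.

```python
import difflib


def normalize(line, ignore_case=False, ignore_all_space=False):
    """Simplified model of how -i and -w change what Diff compares."""
    if ignore_all_space:
        line = "".join(line.split())  # drop every whitespace character
    if ignore_case:
        line = line.lower()
    return line


def lines_differ(left, right, **opts):
    """Stage 1 (Diff's role): decide whether the line pair is reported."""
    return normalize(left, **opts) != normalize(right, **opts)


def highlight_spans(left, right):
    """Stage 2 (the viewer's role): compare the RAW text, options ignored."""
    matcher = difflib.SequenceMatcher(None, left, right, autojunk=False)
    return [
        (tag, left[i1:i2], right[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


left = "Hello  World"   # capital H, two spaces
right = "hello World"   # lowercase h, one space

# With -i alone the pair is still reported, because the whitespace differs.
print(lines_differ(left, right, ignore_case=True))
# With -i and -w together nothing is left to report.
print(lines_differ(left, right, ignore_case=True, ignore_all_space=True))
# The highlighting step sees the raw lines, so it marks the case change too.
print(highlight_spans(left, right))
```

In this model the line is reported under `-i` only because of the extra space, yet the highlighted spans also include the `H`/`h` pair, which matches the display described in the original report.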

     
  • michael-m

    michael-m - 2021-07-17
    • assigned_to: michael-m
    • Priority: 5 --> 8
     
  • Matthew Gavin

    Matthew Gavin - 2021-07-17

    Hello Michael -

    after reading the above carefully (thanks for the clear explanation), and doing more experiments, I agree with your assessment.

    I do think that it would be wise for TkDiff to start implementing some of the common, more useful Diff options when graphically annotating difference lines reported by Diff. End users are looking for a graphical diff that serves their needs; if it doesn't work as they expect, they may look elsewhere. They will likely not care about the purity of keeping TkDiff separate from Diff.

    A sincere thank you for helping me to understand how TkDiff works on top of Diff.

    Matt

     
    • michael-m

      michael-m - 2021-08-05

      You're quite welcome. However, note that "drawing the line" between Diff and TkDiff is only part of the reason for the distinction between the two tools. In addition there is the concern that TkDiff is also a "merge" tool, which is quite often the reason one is looking at differences in the first place. By actually showing the REAL differences between the lines, and not just the ones that the user chose (or not) as significant for initial identification purposes, TkDiff is, in fact, informing the user what will actually be changed when any given line is selected as the "choice" during a merge.
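
A minimal sketch of the merge concern, assuming a hunk is simply a pair of line lists and a merge takes one side verbatim. The `merge` function and the sample lines are hypothetical and do not reflect TkDiff's internal data structures.

```python
def merge(hunks, choices):
    """Build merged output by copying the chosen side of each hunk verbatim."""
    merged = []
    for (left_lines, right_lines), choice in zip(hunks, choices):
        merged.extend(left_lines if choice == "left" else right_lines)
    return merged


# One reported hunk: the sides differ in case AND in whitespace.
hunks = [(["Name =  Alpha"], ["name = alpha"])]

print(merge(hunks, ["left"]))   # keeps the left side's case and spacing
print(merge(hunks, ["right"]))  # keeps the right side's case and spacing
```

Whichever side is chosen, its case and spacing land in the output unchanged, regardless of which categories were ignored when the hunk was first identified. That is why showing every raw difference is useful before making the choice.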

      There is also (to a much lesser degree) the computational cost of trying to adapt the Ratcliff/Obershelp algorithm to handle the numerous "optional" aspects that Diff provides as matching control, which would adversely affect the responsiveness of the TkDiff display operations. That doesn't mean we didn't try - sometimes you have to pick your battles.
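
To give a rough idea of what such an adaptation involves, the sketch below runs Python's `difflib.SequenceMatcher` (a relative of the Ratcliff/Obershelp "gestalt pattern matching" approach, not TkDiff's own implementation) once on the raw text and once on case-folded copies. `diff_spans` and its `key` parameter are hypothetical names, and the example assumes the key function preserves string length so that indices still refer to the original text.

```python
import difflib


def diff_spans(left, right, key=None):
    """Return (i1, i2, j1, j2) index ranges that differ.

    `key` is an optional normalizer applied before matching. It must
    preserve length (str.lower does for ASCII text) so the indices
    remain valid for the original strings.
    """
    a = key(left) if key else left
    b = key(right) if key else right
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [
        (i1, i2, j1, j2)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


left = "Hello  World"
right = "hello World"

raw = diff_spans(left, right)                     # case and whitespace
folded = diff_spans(left, right, key=str.lower)   # whitespace only

print([(left[i1:i2], right[j1:j2]) for i1, i2, j1, j2 in raw])
print([(left[i1:i2], right[j1:j2]) for i1, i2, j1, j2 in folded])
```

Case folding is the easy case because it keeps character positions intact. An option that removes or collapses whitespace changes string lengths, so it would also need a mapping from normalized positions back to display positions for every compared line pair, which is one plausible source of the extra cost mentioned above.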

       
  • michael-m

    michael-m - 2021-08-05
    • status: open --> closed
     
