#1930 full contents does not handle unicodefile in folder compare

Milestone: Branch_+_Trunk
Status: closed-fixed
Owner: nobody
Priority: 5
Updated: 2010-08-12
Created: 2009-06-05
Creator: Matthias
Private: No

See the forum Open Discussion thread 'UCS-2 diff problem' by duncan_lilly.
While checking, I found that folder compare does not convert Unicode files to UTF-8.
In the case of UCS2LE, none of the filters and compare options can work as expected.
For example, 'ignore empty lines' marks all diffs as trivial, because we detect the last char
(the first char, for empty lines) of the line as '\0'.

Discussion

  • Kimmo Varis

    Kimmo Varis - 2009-06-05

    What content is in what encoding when the bug happens? Be more specific and don't make it sound like nothing works since you seem to know the exact problem.

    I'm quite confused about this now: all data to diffutils is converted to UTF-8. And diffutils handles filtering blank lines. Is the problem that some data is not converted to UTF-8 before sending to diffutils?

    How is your work in patch
    #2477680 unicodefile to compare
    http://winmerge.org/patch/2477680
    related? I thought it fixed these problems?

     
  • Matthias

    Matthias - 2009-06-05

    >#2477680 unicodefile to compare
    has nothing to do with this.
    That patch is related to the GUI; there everything works via Transform2FilesToUTF8(),
    but for DiffUtils nothing makes a call in this direction.
    WM just detects binary files in order to switch to a byte-by-byte compare.
    UCS2LE creates the problem...

    >I'm quite confused about this now: all data to diffutils is converted to
    >UTF-8. And diffutils handles filtering blank lines. Is the problem that
    >some data is not converted to UTF-8 before sending to diffutils?
    not one char is converted! Or did I miss something?

     
  • Kimmo Varis

    Kimmo Varis - 2009-06-05

    > not one char is converted!
    WTF?

    What is not converted, and where is it not converted? What is missing? How do filters relate to this? How does ignoring blank lines relate to this?

     
  • Matthias

    Matthias - 2009-06-05

    I am adding the samples from the forum.
    You can modify your code in analyze_hunk()
    >if (!ignore_blank_lines_flag || (!iseolch(files[0].linbuf[i][0]) &&
    >files[0].linbuf[i][0] != 0))
    Check the content of linbuf[i][0] and linbuf[i][1] there;
    you will see it is still UCS2LE.
    So it cannot work correctly.
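
A minimal sketch of why that check misfires on unconverted UCS2LE data, assuming lines are split at the 0x0A byte. The helper names `is_eol_byte` (a stand-in for `iseolch`), `counts_as_nonblank`, and `next_line_start` are hypothetical and are not taken from the diffutils or WinMerge sources.

```c
#include <stddef.h>

/* Hypothetical stand-in for iseolch(): true for a CR or LF byte. */
int is_eol_byte(char c)
{
    return c == '\n' || c == '\r';
}

/* Mirrors the quoted condition: a line counts as non-blank only if its
 * first byte is neither an end-of-line byte nor NUL. */
int counts_as_nonblank(const char *line)
{
    return !is_eol_byte(line[0]) && line[0] != 0;
}

/* Byte-oriented line splitting (an assumption for this sketch): return
 * the offset just past the next 0x0A byte at or after pos, or len if
 * there is none. */
size_t next_line_start(const char *buf, size_t len, size_t pos)
{
    while (pos < len) {
        if (buf[pos++] == '\n')
            return pos;
    }
    return len;
}
```

In UCS2LE the text "a\nb\n" is the byte sequence 61 00 0A 00 62 00 0A 00. Splitting at the 0x0A byte puts the start of the second line at offset 3, which is the 0x00 high byte of the newline itself, so `counts_as_nonblank` returns 0 for a line that actually contains "b". On the UTF-8 bytes of the same text the second line starts at 'b' and the predicate returns 1.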

     
  • Matthias

    Matthias - 2009-06-05

    sample from forum

     
  • Kimmo Varis

    Kimmo Varis - 2009-06-05

    That would mean Unicode file compare is completely broken.

    If I compare two Unicode files UCS2-LE and UCS2-BE from Testing\Data\Unicode folder I get correct results. So Unicode file compare works. And files must be converted to UTF-8 since otherwise those files simply could not be compared by diffutils.
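
To illustrate what converting to UTF-8 before the compare involves, here is a rough sketch of a UCS2LE to UTF-8 conversion. The function name `ucs2le_to_utf8` is hypothetical and this is not WinMerge's conversion code; it assumes the BOM has already been skipped and handles plain UCS-2 only (no surrogate pairs).

```c
#include <stddef.h>

/* Convert UCS2LE bytes to UTF-8 (UCS-2 only, surrogates not handled).
 * Returns the number of bytes written to out, or (size_t)-1 if the
 * input length is odd or out is too small. */
size_t ucs2le_to_utf8(const unsigned char *in, size_t in_len,
                      unsigned char *out, size_t out_cap)
{
    size_t o = 0;
    size_t i;

    if (in_len % 2 != 0)
        return (size_t)-1;

    for (i = 0; i < in_len; i += 2) {
        /* Little endian: low byte first, high byte second. */
        unsigned int cp = in[i] | ((unsigned int)in[i + 1] << 8);

        if (cp < 0x80) {                     /* 1-byte sequence */
            if (o + 1 > out_cap) return (size_t)-1;
            out[o++] = (unsigned char)cp;
        } else if (cp < 0x800) {             /* 2-byte sequence */
            if (o + 2 > out_cap) return (size_t)-1;
            out[o++] = (unsigned char)(0xC0 | (cp >> 6));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else {                             /* 3-byte sequence */
            if (o + 3 > out_cap) return (size_t)-1;
            out[o++] = (unsigned char)(0xE0 | (cp >> 12));
            out[o++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        }
    }
    return o;
}
```

After such a conversion an ASCII newline is the single byte 0x0A and no NUL bytes appear between ASCII characters, which is what byte-oriented line handling expects.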

     
  • Matthias

    Matthias - 2009-06-05

    >And files must be converted to UTF-8 since otherwise those files
    >simply could not be compared by diffutils.
    That's what I mean.
    When testing the samples, all must be different, but two show as equal.

     
  • Kimmo Varis

    Kimmo Varis - 2009-06-05

    All files are compared as different for me. Using WinMerge 2.13.8 experimental, with and without the 'Ignore blank lines' option set.

    I still cannot reproduce this problem.

     
  • Matthias

    Matthias - 2009-06-05

    set option 'ignore empty lines on.'
    With it off, all show as different.

     
  • Matthias

    Matthias - 2009-06-05

    Sorry: two show as different, two as equal.

     
  • Kimmo Varis

    Kimmo Varis - 2009-06-05

    > set option 'ignore empty lines on.'
    There is no such option in WinMerge GUI. There is 'Ignore blank lines' in compare options. And as I already wrote I tried enabling and disabling that option.

    Weird, now that I tried it one more time I finally see the problem.

    All files are UCS2-LE as shown by WinMerge. So conversion works for some files but not all files? Both files that are determined to be identical have correct BOM bytes, so they should be detected correctly as UCS2-LE Unicode files.
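
For reference, a small sketch of BOM-based detection as discussed here. The enum and the function name `detect_bom` are hypothetical, not WinMerge's detection code, and the sketch ignores UTF-32 (whose little-endian BOM also begins with FF FE).

```c
#include <stddef.h>

enum bom_kind { BOM_NONE, BOM_UTF8, BOM_UCS2LE, BOM_UCS2BE };

/* Classify a buffer by its leading byte order mark, if any. */
enum bom_kind detect_bom(const unsigned char *buf, size_t len)
{
    if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
        return BOM_UTF8;      /* EF BB BF */
    if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE)
        return BOM_UCS2LE;    /* FF FE */
    if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF)
        return BOM_UCS2BE;    /* FE FF */
    return BOM_NONE;
}
```

Detecting the encoding this way only labels the file; the bytes still have to be converted before a byte-oriented compare can treat them as text.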

     
  • Matthias

    Matthias - 2009-06-05

    That's also a known bug, option are only working for new foldercompare.
    Unicode is detected; I just cannot find the conversion. Where should WM do it?

     
  • Kimmo Varis

    Kimmo Varis - 2009-06-05

    > That's also a known bug, option are only working for new foldercompare.
    Known by who? Is there a bug report about this?

     
  • Matthias

    Matthias - 2009-06-05

    Folder Compare: Change of "Compare Method" not used - ID: 2796786

     
  • Kimmo Varis

    Kimmo Varis - 2009-06-05

    No, that bug is different. Refreshing all files after changing options (which is what I was doing) is a different thing from selecting a couple of files and trying to compare them with new settings. Refreshing all files has been working, and I've been using it a lot. Selecting a couple of files to compare with different settings is an area where I don't know how it should work. We don't have anything like per-file compare settings, and I'm not sure we really need it; it would make things quite a bit more complex.

     
  • Matthias

    Matthias - 2009-06-05

    Most options are only applied on a new folder compare. The same happens, for example, with colors, etc.
    But I think that should be discussed in the other item.

     
  • Jack Houlson

    Jack Houlson - 2010-08-11

    In which version is this fixed?

     
  • Jack Houlson

    Jack Houlson - 2010-08-11

    Or, if it is still not fixed, when might it be?

     
  • Jack Houlson

    Jack Houlson - 2010-08-11

    (It is a pretty bad bug, isn't it?)

     
  • Matthias

    Matthias - 2010-08-12
    • status: open --> closed-fixed
     
  • Matthias

    Matthias - 2010-08-12

    Solved with ID:3012200 'Defer unicode conversion to diffutils'.

     
