#847 Cannot merge files with different EOLs

Branch_+_Trunk
open
nobody
5
2005-04-26
2005-04-18
Kimmo Varis
No

2.2.0, 2.2.2 stable releases, 2.3.2.0 beta, 2.3.3.1
experimental.

Noticed this when trying to merge updated Polish ISL file.

1) Load files in attached zip to WinMerge:
- from dir1 to left side and from dir2 to right side
2) Select first difference and try to copy it from left
to right

Updated file is detected as Mac file, while file in CVS
is DOS file. So I'm expecting this has something to do
with EOL detection/compare. But I have no time to look
at details now.

Discussion

1 2 > >> (Page 1 of 2)
  • Kimmo Varis

    Kimmo Varis - 2005-04-18

    Files that I cannot merge

     
  • Kimmo Varis

    Kimmo Varis - 2005-04-18

    Configuration log (from 2.3.3.1)

     
  • Kimmo Varis

    Kimmo Varis - 2005-04-18

    Logged In: YES
    user_id=631874

    Attaching configuration log from 2.3.3.1.

     
  • elsapo

    elsapo - 2005-04-22

    Logged In: YES
    user_id=1195173

    I was going to take a look at this, but decided first to
    construct patch 1187777 ( New LoadConfig function), because
    I don't want to do the work of figuring out which line in
    the config file means I have to do what in the options menu,
    when I can have the computer do all that work for me. :)

     
  • Kimmo Varis

    Kimmo Varis - 2005-04-26

    Logged In: YES
    user_id=631874

    I expected this to be a regression, but it turns out this
    has never worked. I tested several experimental and beta
    builds till 2.3.x versions and none could merge that first
    diff in attached files.

     
  • Kimmo Varis

    Kimmo Varis - 2005-04-26
    • labels: --> File handling
     
  • Johan Boule

    Johan Boule - 2005-06-03

    Logged In: YES
    user_id=96100

    How can a file in a CVS be stored with MS\r\n end of lines?

     
  • elsapo

    elsapo - 2005-06-03

    Logged In: YES
    user_id=1195173

    johan-boule, I'm almost certain that "file in CVS" above
    means a file retrieved by a cvs client, and it was surely a
    cvs client on MS-Windows, so the cvs client adjusted the
    line terminations to produce a DOS style file.

     
  • Kimmo Varis

    Kimmo Varis - 2005-06-03

    Logged In: YES
    user_id=631874

    Correct, it is file retrieved from CVS using WinCVS.

     
  • Kimmo Varis

    Kimmo Varis - 2006-02-05

    Logged In: YES
    user_id=631874

    With current CVS I get ASSERT when trying to load the files.
    So fixes done to diffutils have had effect to this bug, but
    not fixed it yet..

     
  • elsapo

    elsapo - 2006-02-05

    Logged In: YES
    user_id=1195173

    I found ASSERT on this earlier today, but now I cannot
    reproduce it -- neither Visual Studio 6 UnicodeDebug, nor
    Visual Studio 6 Debug, nor Visual Studio .NET 2003 Debug,
    nor Visual Studio 2003 UnicodeDebug -- not even when I use
    Load Config to load that config file.

    Did you get ASSERT on cvs trunk code after I applied "PATCH:
    [ 1423907 ] Fix diffutils hashing of \r", and if so, were
    you using attached config log -- and what version were you
    running, in what compiler/IDE?

     
  • Kimmo Varis

    Kimmo Varis - 2006-02-05

    Logged In: YES
    user_id=631874

    Yes, it was after your patch (your patch was the reason I
    wanted to test this again). Now I try this, I get ASSERT
    with several different configs, doesn't seem to matter if I
    ignore whitespaces or not etc. I'm attaching one config log
    from today's CVS trunk.

     
  • elsapo

    elsapo - 2006-02-05

    Logged In: YES
    user_id=1195173

    Ugh. This is a problem with the last line, and I seem to
    remember someone struggling with this and saying it is
    tricky to get right--was that you, Kimmo, or Laurent?

    Anyway, what happens here is that the left-hand file
    (Unicode mac format) ends in a line with a mac delimiter
    (line 360).

    CMergeDoc::CDiffTextBuffer::LoadFromFile correctly notices
    that the last line had a line termination, and tries to add
    a blank line after it.

    But, CCrystalTextBuffer::AppendLine refuses to add it,
    because it has a test to reject blank lines.

    This seems not to involve mac terminations per se, so I need
    to test how this works with Windows terminations.

     
  • Kimmo Varis

    Kimmo Varis - 2006-02-05

    Logged In: YES
    user_id=631874

    So we've regressed last-line EOL? That is very bad.

    Yes, it is tricky, I first fixed it but then came undo/redo
    rewrites, unicode etc. And laoran fixed it for second time.
    I think it took ten or so patches to get all cases work...

    Laoran's solution wasn't very nice (I never really liked how
    last line with diff behaved), but it worked and I really
    didn't want to touch that code.

     
  • elsapo

    elsapo - 2006-02-05

    Logged In: YES
    user_id=1195173

    My diagnosis was wrong I've since realized -- the AppendLine
    returns 0, but it doesn't matter, because the loop still
    counts the line, and when the read loop finishes, it sets
    the array size directly. Also, I was looking at wrong side
    -- I was looking at the left side, and the assert fires on
    the right side. Right side does not end in line termination.

     
  • elsapo

    elsapo - 2006-02-05

    Logged In: YES
    user_id=1195173

    The single diff I'm looking at (the only one I get -- I must
    have my options to not consider line terminations equal, so
    it is considering entire file different) is:

    (begin0,end0)=(1,359)

    (begin1,end1)=(1,360)

    and I'm guessing that that 360 is our problem, because our
    right side (#1) line array is size 360, so 360 is not a
    valid index.

    I guess probably diffutils built the range wrong; not sure yet.

     
  • elsapo

    elsapo - 2006-02-05

    Logged In: YES
    user_id=1195173

    This is tedious. find_and_hash_each_line ends with line=360,
    and so current->valid_lines=360, which seems correct to me.

     
  • elsapo

    elsapo - 2006-02-05

    Logged In: YES
    user_id=1195173

    It's probably not relevant, but walking through back up to
    our code, I see in DiffWrapper that inf[*].missing_newline
    are both 1, and the left-hand one should be 0; that is
    probably a mac line termination issue.

     
  • elsapo

    elsapo - 2006-02-05

    Logged In: YES
    user_id=1195173

    Wow, this is a corner case I'd not thought of. This is in
    the prefix code -- where it is supposed to skip to the last
    line-beginning in the prefix (viz., "/* Skip back to last
    line-beginning in the prefix,").

    The first non-matching characters, which ended the prefix,
    are a semicolon on the left, and a \n on the right -- but
    that \n on the right is the second-half of a \r\n.

    I have to think some more about what is the correct thing to
    do when the common suffix ends smack in the middle of a
    windows \r\n.

     
  • elsapo

    elsapo - 2006-02-06

    Logged In: YES
    user_id=1195173

    I tested it against the steps 1&2 listed in original bug
    description above, and when I copy from left to right, it
    seems successful -(how would I know if it failed--did it
    refuse to copy?

     
  • Kimmo Varis

    Kimmo Varis - 2006-02-14

    Logged In: YES
    user_id=631874

    Still fails for me. If you select first difference as active
    difference and try to merge it, does it work for you?

    No ASSERT anymore though.

     
  • elsapo

    elsapo - 2006-02-14

    Logged In: YES
    user_id=1195173

    I think I need to know what but I'm looking for; I'm not
    clear what behavior you saw that is (or may be*) wrong?

    When you say it fails, do you mean that the text on the
    right does not change? (I cannot reproduce that, using cvs
    trunk, Unicode Debug -- unless I use system codepage 1250,
    which you were not doing, and in which case, the left &
    right are the same anyway, so not changing is correct
    behavior I think.)

    I'm quite willing to believe that there is a bug here in
    diffing, but the only one I see is only apparent if I set to
    custom codepage of 1250, in which case that first diff
    should not be being recognized as a diff at all, I think,
    because it isn't a diff -- and the transcoding in our main
    code path going to where we call diffutils is supposed to
    handle this case, I think. :(

    * I said "or may be" because this case is a bit complicated
    -- more complicated than I realized until I happened to turn
    on the encoding columns in the dirview, and discovered that
    we have on the left UCS-2LE, and on the right CP-1250 (which
    we're misinterpreting as CP-1252). Ugh.

     
  • Kimmo Varis

    Kimmo Varis - 2008-08-04

    Logged In: YES
    user_id=631874
    Originator: YES

    Tested with 2.11.1.2 experimental.

    Ok, latest WinMerge versions show also the file encoding. And it appears another file is UCS-2 and another is ANSI. So this is similar/related to bug item:
    #845994 Files wrongly shown as different (Unicode)
    http://winmerge.org/bug/845994

    Not closing as duplicate as I cannot be sure how these two bugs relate to each other. And here we have a test case.

     
1 2 > >> (Page 1 of 2)

Get latest updates about Open Source Projects, Conferences and News.

Sign up for the SourceForge newsletter:





No, thanks