UTF-8 encoding not reliably detected for files over 4K
Using WinMerge 2.14.0.0 (Unicode)
In the attached set of three text files (all UTF-8 encoded), at most three lines differ between any two of them. However, comparing C-UTF8WithFirstUCAfter4096.txt to either A-UTF8WithFirstUCBefore4096.txt or B-UTF8WithFirstUCBefore4096.txt shows several spurious differences. This happens because WinMerge incorrectly classifies file C as CP1252-encoded rather than UTF-8. That in turn is apparently because the first non-ASCII (multi-byte) character in file C appears after byte 4096.
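To illustrate the suspected failure mode, here is a minimal Python sketch of a detector that only inspects a fixed-size prefix of the file. This is not WinMerge's actual code: the function name guess_encoding, the 4096-byte window, and the CP1252 fallback are assumptions inferred from the behavior described above.

```python
import codecs
from typing import Optional

# Assumed sniffing window, taken from the threshold observed in this report.
SAMPLE_SIZE = 4096


def guess_encoding(data: bytes, sample_size: Optional[int] = SAMPLE_SIZE) -> str:
    """Hypothetical detector: guess 'utf-8' or 'cp1252' from a prefix of the data.

    Pass sample_size=None to inspect the whole buffer.
    """
    sample = data if sample_size is None else data[:sample_size]

    # A pure-ASCII sample carries no evidence of UTF-8, so fall back to the
    # (assumed) default ANSI code page.
    if all(byte < 0x80 for byte in sample):
        return "cp1252"

    # Validate the sample as UTF-8. An incremental decoder tolerates a
    # multi-byte sequence that is cut off at the end of a truncated sample.
    decoder = codecs.getincrementaldecoder("utf-8")()
    is_whole_file = len(sample) == len(data)
    try:
        decoder.decode(sample, final=is_whole_file)
    except UnicodeDecodeError:
        return "cp1252"
    return "utf-8"


if __name__ == "__main__":
    early = "é".encode("utf-8") + b"a" * 5000   # non-ASCII before byte 4096
    late = b"a" * 5000 + "é".encode("utf-8")    # non-ASCII after byte 4096
    print(guess_encoding(early))
    print(guess_encoding(late))
    print(guess_encoding(late, sample_size=None))
```

With a detector of this shape, a file like C (first multi-byte character past the window) looks like plain ASCII within the sample and gets the fallback code page, while files like A and B are recognized as UTF-8. Scanning the whole file, or at least continuing until the first non-ASCII byte, would avoid the mismatch at the cost of more I/O.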
Comparison files, plus configuration.
The situation is somewhat better in WinMerge 2011:
The status bar shows the wrong encoding, but editing works correctly.
https://bitbucket.org/jtuc/winmerge2011/commits/eb29a0a fixes the status bar issue.
This seems to be fixed in the 2.15.2 experimental version.