From: Joachim E. <joa...@gm...> - 2005-10-17 21:25:52
On Monday, 17 October 2005 19:54, Matus Lipka wrote:
> Joachim,
>
> Are you using MBCS? From my humble knowledge of multibyte character
> encodings, I was under the impression that UNICODE is always 2 bytes per
> character. So once a file is detected as being in UNICODE format, *all*
> characters are interpreted as 2 bytes. If not, then all characters are 1
> byte, and the ° and similar characters are never interpreted as multibyte.
>
> These kinds of encodings shouldn't be mixed together in a single file,
> unless something weird like MBCS is used (which could be a non-default
> option in KDiff).
>
> Does this make sense?
>
> Cheers,
>
> Matus

Hi Matus,

The term "Unicode" covers both. You might want to read
http://en.wikipedia.org/wiki/Unicode

Since the name "Unicode" doesn't stand for any specific encoding, the names
UTF-8 or UCS-2 are used to be more precise.

In any case, UTF-8 (which is an 8-bit, variable-width encoding) is becoming
very popular and is often the default (especially on Linux machines).
But KDiff3 should try to honor the default setting of each individual
machine.

Cheers,
Joachim
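
To make the UTF-8 versus UCS-2 distinction concrete, here is a minimal Python sketch that counts the bytes each encoding needs for a few characters, including the ° sign mentioned in the quoted message. The helper name `encoded_lengths` and the sample characters are assumptions made for illustration; they are not taken from KDiff3. Python has no dedicated UCS-2 codec, so `utf-16-le` stands in for it here, which gives the same bytes for characters in the Basic Multilingual Plane.

```python
# Illustrative sketch only: compare per-character byte counts in UTF-8
# and in a fixed 2-byte encoding (UTF-16-LE used as a stand-in for UCS-2,
# which is equivalent for Basic Multilingual Plane characters).

def encoded_lengths(ch):
    """Return (utf8_bytes, ucs2_bytes) for a single character."""
    return len(ch.encode("utf-8")), len(ch.encode("utf-16-le"))

# Hypothetical sample characters: plain ASCII, the degree sign, the euro sign.
samples = ["A", "\u00b0", "\u20ac"]

for ch in samples:
    utf8_len, ucs2_len = encoded_lengths(ch)
    print(f"U+{ord(ch):04X}: UTF-8 = {utf8_len} byte(s), UCS-2 = {ucs2_len} byte(s)")
```

In UTF-8 the byte count varies by character (the ° sign becomes the two bytes C2 B0), while the 2-byte encoding uses the same width for all three samples. That is why a single UTF-8 file can legitimately contain both 1-byte and multibyte characters.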