Version: 6.5.5 Unicode
OS: Win XP 32 bits
Notepad++ doesn't correctly detect the encoding of UTF-8 without BOM files containing "§" (U+00A7, UTF-8 sequence C2 A7). It detects TIS-620 (Thai) instead.
The document should be detected as UTF-8 without BOM and be left unchanged.
The document is detected as TIS-620 and all occurrences of "§" are replaced with the two Thai characters "ยง" (U+0E22 "Yo Yak" and U+0E07 "Ngo Ngu"), which appear as two question marks in boxes.
When I select the "Encode in UTF-8 without BOM" command, nothing happens.
When I select "Convert to UTF-8 without BOM", the symbols "ยง" are converted into their UTF-8 sequences (E0 B8 A2, E0 B8 87) -- I checked with a hex editor.
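The byte-level effect can be reproduced outside Notepad++. This is only a minimal sketch in Python, assuming its standard `tis_620` codec uses the same TIS-620 table as Notepad++; it shows what happens to the bytes, not how Notepad++ chooses the encoding. The helper name `hex_bytes` is made up for the illustration.

```python
def hex_bytes(data: bytes) -> str:
    """Format bytes as space-separated uppercase hex, like a hex editor."""
    return " ".join(f"{b:02X}" for b in data)


# "§" (U+00A7) as stored in a UTF-8 file without BOM.
raw = "\u00a7".encode("utf-8")

# What the file looks like when those bytes are read as TIS-620.
misread = raw.decode("tis_620")

# What "Convert to UTF-8 without BOM" then writes to disk.
reencoded = misread.encode("utf-8")

print(hex_bytes(raw))                        # C2 A7
print([f"U+{ord(c):04X}" for c in misread])  # ['U+0E22', 'U+0E07']
print(hex_bytes(reencoded))                  # E0 B8 A2 E0 B8 87
```

Both C2 and A7 are valid TIS-620 bytes, so the two-byte UTF-8 sequence for "§" is also a plausible piece of Thai text.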
This bug practically prevents the user from using "§" because, each time they open the document, they have to: