#792 Korean characters get displayed wrong

Milestone: Bug
Status: open-wont-fix
Owner: Neil Hodgson
Labels: Scintilla (790)
Priority: 3
Updated: 2009-04-28
Created: 2009-04-27
Creator: Anonymous
Private: No

I'm a user of Florian Balmer's Notepad2, which uses the Scintilla component. The problem is reproducible in other Scintilla-based programs too, so it is very likely a bug in Scintilla itself.

I'm using the German version of Windows XP SP3 Home, with support for East Asian languages (Chinese, Japanese, Korean) installed. I use Notepad2 to edit UTF-8-encoded XML files that contain several languages (English, German, French, Chinese, Japanese, Korean).

The problem is that the Korean characters are sometimes replaced by squares and sometimes shown correctly, even within the same file.
While modifying an XML file, one very mysterious thing occurred:
This screenshot shows the Korean characters displayed correctly:
http://www.photome.de/notepad2/Image1.gif

If I select the "r" of "kr", the Korean characters are replaced by squares:
http://www.photome.de/notepad2/Image2.gif

If I remove the selection, the characters are shown correctly again.

This problem seems to occur only for Korean. Other East Asian characters (Chinese and Japanese) are shown correctly.

To help you reproduce the problem, here is the file from the screenshot:
http://www.photome.de/notepad2/nikon.xml

The problem occurs at line 3197, column 151.

Discussion

  • Neil Hodgson
    2009-04-27

    • assigned_to: nobody --> nyamatongwe
    • status: open --> open-works-for-me
     
  • Neil Hodgson
    2009-04-28

    I can reproduce this with a single-character UTF-8 text file containing "설". The problem does not occur if a Korean-specific font such as Gulim is used. It also works if the character set is changed to Korean: in SciTE, character.set=129. The string "未설" does display correctly.

    Tracing into the code shows that, even when the display fails, the parameters sent to ExtTextOutW still look good (lpString=L"설", cbCount=1). Return is 0 (success) and last error is S_OK, just like calls that display the correct glyph. It looks to me like this is a problem with Windows display of Korean text when used with a non-Korean font or character set.
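
The count of 1 in the trace above is consistent with how the character is encoded. As a small illustration (not part of Scintilla, and not a reproduction of the display problem), the following Python sketch shows that "설" (U+C124) occupies three bytes in the UTF-8 file but only one UTF-16 code unit, which is the unit the wide-character Windows text functions count in. The variable names are chosen here for illustration only.

```python
# Illustration only: storage size of "설" (U+C124) in UTF-8 (the file
# encoding) versus UTF-16 (the encoding taken by wide Windows text APIs).
text = "\uc124"  # "설"

utf8_bytes = text.encode("utf-8")
# UTF-16 uses 2 bytes per code unit; "utf-16-le" adds no byte order mark.
utf16_units = len(text.encode("utf-16-le")) // 2

print(utf8_bytes.hex(" "))  # the bytes as stored in the UTF-8 file
print(len(utf8_bytes))      # number of UTF-8 bytes
print(utf16_units)          # number of UTF-16 code units
```

So a correct conversion from the file's UTF-8 to a wide string should yield exactly one code unit for this character, matching the reported cbCount=1.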

     
  • Neil Hodgson
    2009-04-28

    • priority: 5 --> 3
    • status: open-works-for-me --> open-wont-fix
     
  • You are right, if I change the font to Gulim or set the charset to 129, this problem does not occur.
    (I used Lucida Console).

    Anyhow, I prefer monospace fonts, and if I use Gulim or the charset 129, the font isn't monospace anymore (even if I set "Use Monospace font" in SciTE).

    In Windows Notepad and WordPad the problem does not occur.

     
  • Neil Hodgson
    2009-04-29

    When you are displaying East Asian characters then you will not really be using a monospaced font: in your example, the Korean characters are 10 pixels wide, Chinese 11 pixels and Roman 8 pixels.

    WordPad automatically detects the language and chooses Gulim, which is similar to what ExtTextOutW is supposed to do for you: substitute a font containing a character when that character is missing in the chosen font. I don't know what Notepad does.

    You can provoke display of the Korean characters by placing a Japanese or Chinese character next to them or an individual Korean character like "ᄅ" ("설" is a syllable containing 3 characters).
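
The remark that "설" is a syllable containing 3 characters can be checked with Unicode canonical decomposition. The sketch below uses Python's standard `unicodedata` module to split the precomposed syllable into its conjoining jamo. It only illustrates the Unicode structure of the character and makes no claim about how Windows or Scintilla process it.

```python
import unicodedata

syllable = "\uc124"  # "설", a single precomposed Hangul syllable

# Canonical decomposition (NFD) splits a precomposed Hangul syllable
# into its conjoining jamo: leading consonant, vowel, trailing consonant.
jamo = unicodedata.normalize("NFD", syllable)

for ch in jamo:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# Canonical composition (NFC) turns the jamo sequence back into the syllable.
recomposed = unicodedata.normalize("NFC", jamo)
```

Both forms are canonically equivalent, so the file may contain either the single code point or the three-jamo sequence for the same visible syllable.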

    I can't see anything reasonable for Scintilla to do to fix this. Applications or users can choose to use specific fonts or character sets when this problem is apparent.
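
One way an application could apply the workaround mentioned above is to check whether a document contains Hangul and, if so, select the Korean character set. The following Python sketch shows only the detection step. The function names `contains_hangul` and `suggest_charset` are hypothetical, and the range list covers only three common Hangul blocks as a simplifying assumption. The constant 129 is the value quoted in this thread for SciTE's character.set and matches Scintilla's SC_CHARSET_HANGUL.

```python
# Sketch of a per-document check: suggest the Korean character set when
# the text contains Hangul, otherwise keep the default character set.
# Helper names are hypothetical; the constants mirror Scintilla's values.
SC_CHARSET_DEFAULT = 1
SC_CHARSET_HANGUL = 129  # same value as character.set=129 in SciTE

# Assumption: these three Unicode blocks are enough for typical Korean text.
HANGUL_RANGES = (
    (0x1100, 0x11FF),  # Hangul Jamo
    (0x3130, 0x318F),  # Hangul Compatibility Jamo
    (0xAC00, 0xD7A3),  # Hangul Syllables
)


def contains_hangul(text: str) -> bool:
    """Return True if any character falls in one of the Hangul ranges."""
    return any(lo <= ord(ch) <= hi for ch in text for lo, hi in HANGUL_RANGES)


def suggest_charset(text: str) -> int:
    """Pick the character set value an application might apply to its styles."""
    return SC_CHARSET_HANGUL if contains_hangul(text) else SC_CHARSET_DEFAULT
```

An application would still have to apply the result itself, for example through Scintilla's SCI_STYLESETCHARACTERSET message. As noted earlier in the thread, switching the character set may also change which font is used, so the text may no longer appear monospaced.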

     
  • If this is a bug in Windows' underlying API, why are Notepad and WordPad both free of the same problem?