From: John L. <jla...@gm...> - 2012-06-12 04:22:46
On Mon, Jun 11, 2012 at 3:26 PM, Paul K <pau...@ya...> wrote:
> Hi All,
>
> I've been working on a wxLua-based IDE
> (https://github.com/pkulchenko/ZeroBraneStudio/) and have a bug opened
> about dealing with utf-8 encoding
> (https://github.com/pkulchenko/ZeroBraneStudio/issues/7). I've
> captured relevant information in the ticket; to summarize, I seem to
> be setting correctly both the codepage
> (SetCodePage(wxstc.wxSTC_CP_UTF8)) and the encoding
> (wxFONTENCODING_UNICODE), but the text is still shown as single-byte
> garbled content (there are some examples in the ticket), even though
> the editor seems to recognize that these are two-byte characters (as
> it doesn't allow to position the cursor in the middle of any unicode
> character).

Can you paste this into the editor from Firefox? It works fine in
Linux with the Unicode build.

Sanskrit: काचं शक्नोम्यत्तुम् । नोपहिनस्ति माम् ॥

> Also, when I set wxFONTENCODING_UTF8, I get "No font for displaying
> text in encoding 'Unicode 8 bit (UTF-8)' found." and even after I
> select "Lucida Sans Unicode", which is reported to be a unicode font,
> I still have the issue above.
>
> I also found this message from John "They probably compiled it in ANSI
> mode and not Unicode. Note however
> that Lua is strictly ANSI only." in this thread
> (http://comments.gmane.org/gmane.comp.lib.wxwidgets.wxlua.user/2636).
> Does this mean that the binaries I'm using need to be compiled with
> some other configuration (Unicode mode)? Note that in my case I'm not
> manipulating UTF strings, I just want them to be displayed correctly
> in the editor.

Where did you get your binaries? The Windows 2.8.10 ones at
wxlua.sf.net are compiled in ANSI and not Unicode. The idea was that
since Lua is ANSI, wxLua might as well be too, because multibyte chars
will not be handled properly in the Lua string.XXX functions. However,
if you are careful it can be made to work.

> What am I doing wrong here? Is there *any* WxLua based application
> working on windows with UTF8 encoded text (with or without BOM)? I've
> tested with both 2.8.7 and 2.8.10 (on windows Vista) with the same
> result. WxLua and wxLuaEdit also show the same behavior.

You have to recompile wxLua for Unicode, linking against a Unicode
wxWidgets build. When compiled in ANSI mode, strings are considered
one char per byte and that's it.

I plan on providing binaries for Windows compiled in Unicode for the
next release.

Regards,
John
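
To illustrate the "one char per byte" point above, here is a minimal
sketch in Python (used only for illustration; it is not wxLua code and
not how wxLua is implemented). It assumes the ANSI code page behaves
like Latin-1, which is a simplification; the helper names
as_single_byte and utf8_byte_count are made up for this example.

```python
def as_single_byte(text: str) -> str:
    """Encode text as UTF-8, then misread every byte as one character.

    Latin-1 stands in for an ANSI code page here (an assumption); it
    maps each byte value 0-255 to exactly one character.
    """
    return text.encode("utf-8").decode("latin-1")


def utf8_byte_count(text: str) -> int:
    """Number of bytes the text occupies once encoded as UTF-8."""
    return len(text.encode("utf-8"))


sample = "caf\u00e9"  # "cafe" with an acute accent on the final e

print(len(sample))              # characters in the original text
print(utf8_byte_count(sample))  # bytes after UTF-8 encoding
# ascii() escapes non-ASCII characters so the output is terminal-safe.
print(ascii(as_single_byte(sample)))
```

The accented letter occupies two bytes in UTF-8, so a byte-per-char
reader sees two unrelated characters in its place. A Devanagari letter
such as the ones in the Sanskrit sample takes three bytes and so turns
into three characters, which would match the garbled display described
in the ticket.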