Hi all,
I was looking into a few older bugs in the tracker (and fixing some of them) and I came upon this one: bug #1510436 "NPC Names in Russian" (http://sourceforge.net/tracker/index.php?func=detail&aid=1510436&group_id=2335&atid=102335)
I happily coded a solution which, I now think, may not be ideal. So I thought I'd run it by you to get ideas.
The root of the problem is that ES uses UTF-8 (because GTK does) while Exult (like U7) uses ASCII (and has glyphs for only some of the first 128 characters); hence, when editing NPC names (or shape names, or frame names) in ES in other languages (such as French or Russian), ES will return a UTF-8 string which will get truncated by Exult on display.
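To make the mismatch concrete, here is a small Python sketch (Python only for illustration; neither ES nor Exult is written in it). The helper name non_ascii_byte_count and the sample names are made up for this example. It only counts how many bytes of the UTF-8 form fall outside 7-bit ASCII, which are the bytes an ASCII-only consumer has no glyph for; it does not reproduce Exult's actual display code.

```python
def non_ascii_byte_count(name: str) -> int:
    """Count the bytes of the UTF-8 encoding that lie outside 7-bit ASCII."""
    return sum(1 for byte in name.encode("utf-8") if byte >= 0x80)


# A plain ASCII name has no such bytes; accented or Cyrillic names do.
for sample in ("Iolo", "Élise", "Иван"):
    print(sample, len(sample.encode("utf-8")), non_ascii_byte_count(sample))
```

Every Cyrillic letter takes two bytes in UTF-8, both outside the ASCII range, so a Russian name contains nothing an ASCII-only font can draw.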
Please note that in what follows, I loosely use the term "codepage"; using "character set" might be a closer match.
The solution I coded, which I will call solution (1), translates (using GLib calls) this UTF-8 string into a char string, trying several codepages (starting with the current system codepage) until one is found that "fits" the string; if none is found, the string gets translated (with loss) into the system codepage (or into ISO-8859-1 if the system uses UTF-8). Also, when ES receives a string, it converts it into UTF-8 for display. There are several problems with this approach; for example, the first codepage that "fits" the string when translating from UTF-8 may not be the one you want, and translating into UTF-8 using the current codepage may not give the result you want either. Maybe prompting for a codepage instead would be slightly better.
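As a rough illustration of solution (1), here is a Python sketch using the standard codecs instead of the GLib calls in the real code. Everything named here is an assumption for the example: the function fit_codepage, the candidate list, and the choice to skip UTF-8 when it is the system codepage (since it would trivially "fit" anything).

```python
import codecs

# Hypothetical candidate list; the actual list of codepages tried is not given here.
CANDIDATE_CODEPAGES = ["iso-8859-1", "cp1251"]


def is_utf8(codepage):
    """True if the codepage name is an alias of UTF-8."""
    return codecs.lookup(codepage).name == "utf-8"


def fit_codepage(text, system_codepage, candidates=CANDIDATE_CODEPAGES):
    """Return (codepage, encoded_bytes, lossy) for the first codepage that fits."""
    for codepage in [system_codepage, *candidates]:
        if is_utf8(codepage):
            continue  # assumption: UTF-8 itself is never an acceptable target
        try:
            return codepage, text.encode(codepage), False
        except UnicodeEncodeError:
            continue  # this codepage cannot represent the string; try the next
    # Nothing fits: convert with loss into the system codepage,
    # or into ISO-8859-1 if the system uses UTF-8.
    fallback = "iso-8859-1" if is_utf8(system_codepage) else system_codepage
    return fallback, text.encode(fallback, errors="replace"), True


print(fit_codepage("Иван", "utf-8"))
print(fit_codepage("Élise", "utf-8"))
```

With this candidate order, "Élise" ends up in iso-8859-1 simply because that codepage is tried first, whether or not it is the one the user had in mind; that is the ambiguity mentioned above.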
I then thought of two more solutions:
(2) As above, but allow a user to specify (as a global setting) the codepage he wants to use (as the system codepage may not be what he wants, particularly if the system uses UTF-8) and then translate back and forth into this codepage (allowing lossy conversion, but notifying the user of the issue). This is a lot cleaner, and it makes back-and-forth translation lossless whenever the string fits the chosen codepage. I think I will implement it this way.
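A minimal sketch of option (2), again in Python rather than with GLib, and with made-up names (to_codepage, from_codepage). The single codepage argument stands in for the global setting; the lossy flag is what the caller would use to notify the user.

```python
def to_codepage(text, codepage):
    """Encode text for the game; return (encoded_bytes, lossy)."""
    try:
        return text.encode(codepage), False
    except UnicodeEncodeError:
        # Lossy conversion is allowed, but the caller is told about it.
        return text.encode(codepage, errors="replace"), True


def from_codepage(data, codepage):
    """Decode bytes coming from the game into text for display."""
    return data.decode(codepage, errors="replace")


chosen = "cp1251"  # hypothetical user setting
data, lossy = to_codepage("Иван", chosen)
print(lossy, from_codepage(data, chosen) == "Иван")
```

Because the same codepage is used in both directions, a name that fits it comes back unchanged, and a name that does not fit is flagged instead of being silently mangled.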
(3) Change Exult to use UTF-8 for NPC/shape/frame names. For starters, I am not sure if vga fonts can have all the required characters (although I think they can). It would also be a lot of work to modify all text output, input and processing routines, and making a font for a mod would be boring work -- although we could host the fonts in SVN and gradually add characters as users submit them.
As I said, I am leaning towards option (2), although (3) is something to think about. Any thoughts?
--
Marzo Sette Torres Junior --+-- marzojr@...
marzojr@... ----+---- marzojr@...
"Mental slavery is mental death and every man who has
given up his intellectual freedom is the living
coffin of his dead soul."
-- Robert Green Ingersoll, "Individuality" (1873)