From: Bill S. <we...@ri...> - 2001-03-25 21:21:01
On Fri, 23 Mar 2001, Eric Peyton wrote:
> Got another question for you fine folks.
>
> OS X is a fully Unicode and UTF-8 aware OS. I currently pass all strings
> as UTF-8 and pass the UTF8String to you all, expecting that on the other
> end "the right thing" will happen.
>
> Of course, it doesn't. All my foreign users (I had six complaints this
> morning on this) complain about the fact that their umlaut and local
> characters get trashed when sent across icq.
>
> Does icqlib have a way of handling this? Is there a special library I
> need?

Hi Eric,

Interesting you should mention this; I'm currently hacking on the encoding
support in icqlib. This is the first time I've ever worked with character
encodings, so bear with me :)

The ICQ protocol uses the Windows codepages to encode strings, not UTF-8.
Unix, of course, uses a different set of codepages. Icqlib has always had
support for translating between Unix's Cyrillic codepage (KOI8-R) and the
Windows Cyrillic codepage (Windows-1251), since Denis speaks Russian, and
just recently I added support for the Czech encodings from a patch that was
submitted. Those are the only two translations we support right now.

You'll need to translate the UTF-8 strings before you pass them to icqlib,
because icqlib is expecting the standard Unix ISO-8859-* encodings.

Another (probably better) idea would be to add direct UTF-8 <-> Windows
codepage translation code, and then icqlib would support Unicode natively,
which would be excellent! I was thinking about this while working on the
encoding support. After looking around a bit just now, it looks like glibc
has some conversion functions you can use; look for 'iconv'. Maybe we can
switch to using this in icqlib too, and then we'd support many, many
encodings.

I don't have time to do this myself right now, and it will likely be a few
months before I could get to it, but I'll put it on my TODO list. Of course,
we'd gladly accept a patch too!

Bill
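
For reference, here is a rough sketch of what the iconv route suggested above might look like. It assumes glibc's iconv(3) interface (iconv_open, iconv, iconv_close). The helper name convert_encoding is made up for illustration and is not part of icqlib, and the encoding names used ("UTF-8", "CP1252", "CP1251", "KOI8-R") are ones glibc accepts; other iconv implementations may spell them differently. It only handles byte-oriented target encodings, since the result is returned as a NUL-terminated string.

```c
#include <iconv.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not an icqlib function): convert a NUL-terminated
 * string from encoding `from` to encoding `to` using iconv(3).
 * Returns a malloc'd, NUL-terminated buffer that the caller must free,
 * or NULL if the conversion is unsupported or the input cannot be
 * represented in the target encoding. */
char *convert_encoding(const char *from, const char *to, const char *input)
{
    /* Note the argument order: target encoding first, then source. */
    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        return NULL;

    size_t in_left = strlen(input);
    /* Generous bound: no common byte-oriented encoding needs more than
     * four output bytes per input byte. One extra byte for the NUL. */
    size_t out_size = in_left * 4 + 1;
    char *output = malloc(out_size);
    if (output == NULL) {
        iconv_close(cd);
        return NULL;
    }

    /* iconv() takes char ** for the input but does not modify the data. */
    char *in_ptr = (char *)input;
    char *out_ptr = output;
    size_t out_left = out_size - 1;

    size_t rc = iconv(cd, &in_ptr, &in_left, &out_ptr, &out_left);
    iconv_close(cd);

    if (rc == (size_t)-1) {
        /* Invalid input sequence or character not representable. */
        free(output);
        return NULL;
    }

    *out_ptr = '\0';
    return output;
}
```

A client holding UTF-8 text could call something like convert_encoding("UTF-8", "CP1252", text) before handing the string over, and do the reverse on incoming messages. Picking the right Windows codepage for a given user is a separate problem that this sketch does not address.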