Re: [Gaim-devel] [oscar] polish characters

Robert Gomułka spake unto us the following wisdom:
> When talking to people using ICQ200x. I cannot send and receive polish
> characters properly.

This does not surprise me at all; I've had suspicions about ICQ and
custom character sets for some time.

> They appear on both sides as encoded in ISO8859-1.

Yes, it appears that ICQ200x sets both AIM_IMFLAGS_ISO_8859_1 *and*
AIM_IMFLAGS_CUSTOMCHARSET.  I have no idea if that is "correct" or
not, or if it *means* anything or not.

> In fact - they are encoded as CP1250 (Windows-EE) - pseudo
> Microsoft standard.

CP1250 is in fact a slight mangling of (the actual standard)
ISO-8859-2 ... I had hoped this would provide us with some information
about the encoding process, but came up with no relation to the
(3, 65536) tuple you're seeing.
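
To make "slight mangling" concrete for the Polish case: only six Polish letters sit at different byte values in the two sets, and the rest share one code. The sketch below is illustrative only. The function names are made up here, it covers Polish letters and nothing else, and it is not code from oscar.c.

```c
#include <assert.h>
#include <stddef.h>

/* Map one CP1250 byte to the ISO-8859-2 byte for the same Polish letter.
 * Only the six letters that differ are remapped; every other byte is
 * returned unchanged, which is NOT correct for non-Polish characters. */
static unsigned char polish_cp1250_to_iso8859_2(unsigned char b)
{
    switch (b) {
    case 0xA5: return 0xA1; /* Ą */
    case 0xB9: return 0xB1; /* ą */
    case 0x8C: return 0xA6; /* Ś */
    case 0x9C: return 0xB6; /* ś */
    case 0x8F: return 0xAC; /* Ź */
    case 0x9F: return 0xBC; /* ź */
    default:   return b;    /* Ć ć Ę ę Ł ł Ń ń Ó ó Ż ż share one code */
    }
}

/* Count the bytes in buf that would be misread if CP1250 text were
 * decoded as ISO-8859-2 (Polish letters only). */
static size_t count_polish_mismatches(const unsigned char *buf, size_t len)
{
    size_t n = 0;
    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        if (polish_cp1250_to_iso8859_2(buf[i]) != buf[i])
            n++;
    }
    return n;
}
```

So a message decoded with the wrong one of the two tables mostly looks right, with only the occasional ą, ś or ź coming out wrong.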

> As suggested, I applied a small patch to oscar.c:
>         if (args->icbmflags & AIM_IMFLAGS_CUSTOMCHARSET) {
>                 debug_printf ("Custom character set: %d %d\n", args->charset, args->charsubset);
> +                if (args->charset == 3){
> +                    tmp = g_convert(args->msg, args->msglen, "UTF-8", "CP1250", NULL, &convlen, &err);
> +                    if (err) {
> +                        debug_printf("CP1250 IM conversion: %s\n", err->message);
> +                        tmp = strdup(_("(There was an error receiving this message)"));
> +                    }
> +                }
>         }
>
> Why 3? Because it appeared when executed gaim -d. charsubset was 65536.
> It did the thing. I receive messages with proper characters displayed.

This is more or less what I would do, too.  I would like to find more
correlation between charset and charsubset numbers and certain
encodings, but with the limited information we have now this is
reasonable.
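
If more (charset, charsubset) pairs turn up, one way to keep them out of the if-chain is a small lookup table. This is only a sketch: the struct and function names are hypothetical, and the single entry is the one pair reported in this thread. Whether (3, 65536) always means CP1250, or merely reflects the sender's Windows locale, is not known.

```c
#include <stddef.h>

/* One observed pairing of OSCAR charset numbers with a codeset name
 * that could be handed to g_convert(). Hypothetical structure. */
struct charset_hint {
    unsigned int charset;
    unsigned int charsubset;
    const char *codeset;
};

static const struct charset_hint charset_hints[] = {
    { 3, 65536, "CP1250" },  /* seen from ICQ200x; the only known entry */
};

/* Return the codeset name for a (charset, charsubset) pair,
 * or NULL if the pair has not been observed. */
static const char *codeset_for(unsigned int charset, unsigned int charsubset)
{
    size_t n = sizeof charset_hints / sizeof charset_hints[0];
    for (size_t i = 0; i < n; i++) {
        if (charset_hints[i].charset == charset &&
            charset_hints[i].charsubset == charsubset)
            return charset_hints[i].codeset;
    }
    return NULL;
}
```

A NULL result would mean falling back to whatever the code did before the patch.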

> But ...
> What with sending messages?
> I see that they are sent always as UTF. Have no idea how to
> g_convert messages _only_ sent to people using windows icq200x
> client. I am afraid there is a need to follow whole conversation :(
> I don't know a way to _guess_ client version or client encoding.

I suspect that somewhere along the line we are informed that the peer
wishes to use a non-UTF non-ISO-latin-1 non-ASCII encoding.  If
nothing else, the fact that the peer used a custom charset tells us
something.  Perhaps this should be used to set some flags/store some
information in the connection structure.
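
Roughly what storing that information might look like, as a sketch under assumptions: the struct below is hypothetical and is not an existing Gaim or libfaim structure, and treating charset 3 as CP1250 rests only on the one observation in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-peer state, kept alongside the conversation. */
struct peer_charset_state {
    int saw_custom_charset;   /* nonzero once the peer sent CUSTOMCHARSET */
    unsigned int charset;     /* last charset number seen from the peer */
    unsigned int charsubset;  /* last charsubset number seen from the peer */
};

/* Record what an incoming IM revealed about the peer's encoding. */
static void peer_note_incoming(struct peer_charset_state *peer,
                               int had_custom_flag,
                               unsigned int charset,
                               unsigned int charsubset)
{
    assert(peer != NULL);
    if (!had_custom_flag)
        return;  /* nothing learned from this message */
    peer->saw_custom_charset = 1;
    peer->charset = charset;
    peer->charsubset = charsubset;
}

/* Pick a codeset for replies: the legacy one if the peer has shown
 * it uses it, otherwise UTF-8 as messages are sent today. */
static const char *peer_outgoing_codeset(const struct peer_charset_state *peer)
{
    assert(peer != NULL);
    if (peer->saw_custom_charset && peer->charset == 3)
        return "CP1250";
    return "UTF-8";
}
```

The obvious weakness is that nothing is known about the peer until it has sent at least one message, so the first outgoing IM would still go out in the default encoding.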

> Talking to people using gaim (oscar plugin) works perfectly (almost
> perfectly - when I am offline and get message with polish
> characters, after going online, I receive empty or partial message
> - without polish chars).

The only incoming messages I currently handle are standard IMs ...
this will hopefully change in the near future, but I've been pressed
for time lately.

> Have you got any ideas? Maybe I should study other clients code
> (licq, ickle) to find a solution? Or libicq2000?

That may or may not be useful.  It has been my experience that
virtually *all* clients are busted and just send whatever charset they
want to whomever they want at all times.  (name that reference)

It is *possible* that simply unsetting the AIM_IMFLAGS_CUSTOMCHARSET
on sending a message would fix this problem ...  We always set
CUSTOMCHARSET for *every* outgoing packet, and I suspect this is
wrong.  I haven't had time to verify that yet, though, so I haven't
changed anything.  If it *is* wrong, though, the remote client may
even be capable of a UCS2 conversion and just not be trying it.  It's worth a
shot to unset it and see if the sender-side problem just magically
goes away.
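
The experiment amounts to clearing one bit before the message goes out. A minimal sketch, assuming placeholder bit values: the SKETCH_ constants below are invented for illustration, and the real AIM_IMFLAGS_* values are defined in libfaim's headers.

```c
/* Placeholder bit values for this sketch only; NOT the real
 * AIM_IMFLAGS_* definitions from libfaim. */
#define SKETCH_IMFLAGS_ISO_8859_1    0x0001u
#define SKETCH_IMFLAGS_CUSTOMCHARSET 0x0002u

/* Return the outgoing flags with the custom-charset bit cleared
 * and every other bit left as it was. */
static unsigned int outgoing_flags_without_custom(unsigned int flags)
{
    return flags & ~SKETCH_IMFLAGS_CUSTOMCHARSET;
}
```

Applying this to the flags of every outgoing IM and then watching what an ICQ200x peer displays would show whether the flag is the culprit.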

So that's my maybe-useful-maybe-not $0.02.  ;-)

Ethan

-- 
And if I claim to be a wise man / it surely means that I don't know.
                -- Kansas, "Carry on Wayward Son"
