Re: [Lurker-users] broken charset error
From: Wesley W. T. <we...@te...> - 2003-02-17 01:13:56
On Sun, Feb 16, 2003 at 04:58:23PM -0600, Jamin W. Collins wrote:
> I recently noticed that some list messages are not completely displayed
> by Lurker. These messages contain non-English characters and are
> displayed up to the point of these characters. At this point, the
> message display is terminated with the following error:
>
> *** ERROR: BROKEN CHARSET 'US-ASCII' DURING DECODE ***
>
> The following two links illustrate this problem:
>
> http://asgardsrealm.net/lurker/message/E18kWwB-0007Sy-00%40sc8-sf-web1.sourceforge.net.html
> http://asgardsrealm.net/lurker/message/20010627201112.91114.qmail%40web11205.mail.yahoo.com.html
>
> Based on the README, Lurker should be able to handle non-English
> characters, right?

It definitely does. The problem with these messages is that the charset
they declare is not the charset they actually use. To see the problem,
look at:

http://asgardsrealm.net/lurker/mbox/20010627201112.91114.qmail%40web11205.mail.yahoo.com.txt

This message clearly states 'Content-Type: text/plain; charset=us-ascii',
yet the body includes a high-ASCII value in this JS fellow's name. If the
Content-Type were correct, lurker would have decoded this message.

I agree that simply aborting is excessive. However, iconv has no way of
knowing in the general case (real foreign languages) how to proceed after
the encoding has been violated. If you feel that this character should be
allowed by us-ascii (which I personally am not certain about), contact
the iconv maintainers.

I might be convinced to make lurker forcibly restart decoding after that
point, but the error message should certainly still be included: the
email is broken.

I have seen this behaviour in a number of messages. It would be nice to
track down the offending email client and file a bug against it.

-- 
Wesley W. Terpstra <we...@te...>
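
To illustrate the failure and the "restart decoding after that point" idea
discussed above, here is a minimal sketch in Python. It is not lurker's
code and does not use iconv; Python's built-in codecs stand in for it. The
sample bytes, the function name decode_with_restart, and the "[?]" marker
are all hypothetical, chosen only for illustration.

```python
def decode_with_restart(raw: bytes, charset: str = "us-ascii",
                        marker: str = "[?]") -> str:
    """Decode strictly; at each invalid byte, emit a marker and resume."""
    out = []
    pos = 0
    while pos < len(raw):
        try:
            # Strict decode of everything that is left.
            out.append(raw[pos:].decode(charset))
            break
        except UnicodeDecodeError as err:
            # err.start / err.end are offsets into the slice raw[pos:].
            out.append(raw[pos:pos + err.start].decode(charset))
            out.append(marker)  # keep the breakage visible to the reader
            pos += max(err.end, err.start + 1)  # always make progress
    return "".join(out)


# Hypothetical body: declared as us-ascii, but contains bytes above 0x7F.
body = b"Hello J\xe9r\xf4me S."

try:
    body.decode("us-ascii")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False  # the same kind of violation that stops the display

recovered = decode_with_restart(body)
print(strict_ok, recovered)
```

The strict decode fails at the first byte outside the declared charset,
while the restarting variant keeps the rest of the text and still flags
each bad byte, so the reader can see that the email is broken.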