When replying to messages sent with UTF-8 encoding, the
compose text area does not decode the text. This results in
undecoded UTF-8 "garbage characters".
As a quick fix on my local installation I've added this (diff -u):
@@ -743,6 +743,7 @@
 }
 unset($rewrap_body[$i]);
 }
+$body = utf8_decode($body);
 $body = getReplyCitation($from) . $body;
 $composeMessage->reply_rfc822_header =
 $orig_header;
in src/compose.php, but this may break other things I'm not
aware of...
(version <= 1.4.1)
Logged In: YES
user_id=680481
Oops... I guess it needs to check the header to see whether the
message really is UTF-8 encoded, or else this fix messes up other
encodings.
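A minimal sketch of such a check, assuming the charset has already been
read from the Content-Type header. decode_reply_body() is a hypothetical
helper, not a SquirrelMail function, and it uses mb_convert_encoding()
from the mbstring extension in place of utf8_decode():

```php
<?php
// Hypothetical helper, not part of SquirrelMail: decode the reply body
// only when the message's declared charset is UTF-8.
function decode_reply_body(string $body, ?string $charset): string
{
    if ($charset !== null && strcasecmp(trim($charset), 'utf-8') === 0) {
        // Same target as utf8_decode(): single-byte ISO-8859-1.
        return mb_convert_encoding($body, 'ISO-8859-1', 'UTF-8');
    }
    // Any other (or unknown) charset is left untouched.
    return $body;
}
```

With this guard, a body declared as koi8-r or iso-8859-2 passes through
unchanged instead of being mangled by a UTF-8 decode.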
Logged In: YES
user_id=680481
I seem to be having a conversation with myself here... oh, well.
I've been digging a bit more into my problem and the charset
information seems to vanish down a black hole called
processParameters() in class/mime/Rfc822Header.class.php. I don't
know much about the contents of the RFCs, but the headers I get
usually look like this:
Content-Type: text/plain; charset=UTF-8; format=flowed
The UTF-8 information never gets into the $message->rfc822_header
array. I found a "FIXME" in processParameters() so maybe someone is
working on it?
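For illustration only, here is roughly how the parameters of a header
like the one above could be split out. parse_content_type_params() is a
hypothetical helper, not the actual processParameters() code, and this
naive split on ";" would break on quoted values that contain a
semicolon:

```php
<?php
// Hypothetical helper for illustration; not SquirrelMail's
// processParameters(). Returns lower-cased parameter names => values.
function parse_content_type_params(string $value): array
{
    $params = [];
    $parts = explode(';', $value);
    array_shift($parts); // drop the media type itself, e.g. "text/plain"
    foreach ($parts as $part) {
        if (strpos($part, '=') === false) {
            continue; // not a name=value pair
        }
        [$name, $val] = explode('=', $part, 2);
        // Strip whitespace first, then any surrounding double quotes.
        $params[strtolower(trim($name))] = trim(trim($val), '"');
    }
    return $params;
}

$params = parse_content_type_params('text/plain; charset=UTF-8; format=flowed');
// $params['charset'] now holds the charset value from the header.
```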
Logged In: YES
user_id=225877
SquirrelMail does not do any decoding when you reply to or
forward a message. Doing so may result in incorrectly
formatted emails.
For example, suppose you use iso-8859-1 encoding and reply to
a utf-8 or koi8-r email with Cyrillic symbols. The browser
actually lets you read and submit Cyrillic in the text box,
but it translates these symbols to HTML codes, and the
resulting email will contain &name; or &#number; instead of
Cyrillic. It does not look good in mutt and mozilla. Not sure
if Outlook Express is able to parse it. That's the reason why
a patch which runs decoding when replying or forwarding
hasn't reached SM yet.
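A small sketch of the effect described above. It does not reproduce any
browser's code; it only mimics the outcome by turning every character
outside the iso-8859-1 range into a decimal &#number; reference, using
mb_encode_numericentity() from the mbstring extension. The sample text
is an assumption chosen for the example:

```php
<?php
// Illustration only: Cyrillic text typed into a form on an
// iso-8859-1 page ends up as numeric character references.
$cyrillic = "\u{041F}\u{0440}\u{0438}\u{0432}\u{0435}\u{0442}"; // "Привет"

// Map every code point above U+00FF (outside iso-8859-1) to &#number;
$convmap = [0x100, 0x10FFFF, 0, 0x1FFFFF];
$submitted = mb_encode_numericentity($cyrillic, $convmap, 'UTF-8');
```

The resulting string is plain ASCII, which is why a mail client that
does not interpret HTML shows the raw codes instead of Cyrillic.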
Logged In: YES
user_id=876771
Actually, the message body is never converted from the
original charset to the default charset when trying to reply
to or forward an old message. I found a solution, which may
be universal - I added this line:
$body = charset_decode($message->header->getParameter('charset'), $body);
at approximately line 665, before "switch($action)" ... in
function "newMail(...)"
If it isn't right, sorry, I'm not a hacker, but it works for
me and it took me many hours to solve this. Actually, my
problem is the other way around: I want to use UTF-8 as the
default, but want to reply to (and forward) messages written
in iso-8859-1 and iso-8859-2 as well.
Pavel
Logged In: YES
user_id=225877
Problem No. 1: utf8_decode converts a string with
ISO-8859-1 characters encoded with UTF-8 to single-byte
ISO-8859-1.
iso-8859-1 encoding is used only in some of the SquirrelMail
translations.
Problem No. 2: there may be other encodings. An email can
contain two MIME blocks encoded with different charsets. I
think your patch assumes that the entire email is utf-8
encoded.
Similar code is included in 1.5.0cvs, but it has some
limitations. Headers are not yet decoded.
The biggest problem: utf-8 supports more symbols than the
charset that you use. You can't reliably decode from utf-8 to
regular iso-8859-x or windows-125x charsets.
A real fix would have to:
1. add decoding to the reply/forward/mdn functions
2. add checks that do not allow sending emails with
unsupported symbols.
Currently I am working on the first one and trying to put
decoding in compose, when it is possible.
Logged In: YES
user_id=225877
SquirrelMail 1.5.1cvs from 2004-08-27 includes an option that
allows lossy charset conversion in the compose window. See 10.
Languages -> 6. Enable lossy encoding.
If this option is enabled, SquirrelMail should convert the
message from utf8 to iso-8859-1. If the email contains
characters unsupported by iso-8859-1, they will be replaced
with question marks.
Currently SquirrelMail 1.5.1cvs supports iso-8859-1 and
utf-8 encoding. Other charsets will use us-ascii encoding,
which converts all 8bit symbols to question marks.
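The question-mark replacement described above can be illustrated with
PHP's mbstring extension. This is only a sketch of the general idea of
lossy conversion, not SquirrelMail's own conversion code, and the sample
string is an assumption chosen for the example:

```php
<?php
// Illustration of lossy utf-8 -> iso-8859-1 conversion.
$utf8 = "caf\u{00E9} \u{20AC}5 \u{0416}"; // "café €5 Ж"

// Make the substitute for unmappable characters explicit: '?' (0x3F).
mb_substitute_character(0x3F);

// U+00E9 exists in iso-8859-1; U+20AC and U+0416 do not and become '?'.
$latin1 = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');
```

The accented letter survives as a single iso-8859-1 byte, while the euro
sign and the Cyrillic letter are lost, which is why the conversion is
called lossy.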
Logged In: YES
user_id=225877
Fixed in 1.4.4cvs and 1.5.1cvs. The lossy_encoding option must
be set to true, because iso-8859-1 does not cover all
symbols supported by the utf-8 charset.