#1458 UTF-8 messages are not decoded when using reply

closed-fixed
Compose (426)
5
2004-11-07
2003-09-15
Christian
No

When replying to messages sent with UTF-8 encoding, the
compose text area does not decode the text. This results in
undecoded UTF-8 "garbage characters".

As a quickfix on my local installation I've added this (diff -u):

@@ -743,6 +743,7 @@
}
unset($rewrap_body[$i]);
}
+ $body = utf8_decode($body);
$body = getReplyCitation($from) . $body;
$composeMessage->reply_rfc822_header =
$orig_header;

in src/compose.php, but this may break other things I'm not
aware of...

(version <= 1.4.1)

Discussion

  • Christian

    Christian - 2003-09-17

    Logged In: YES
    user_id=680481

    Oops..., I guess it needs to check the header if it really is a UTF-8
    encoded message, or else this fix messes up other encodings.

     
  • Christian

    Christian - 2003-09-17

    Logged In: YES
    user_id=680481

    I seem to be having a conversation with myself here... oh, well.

    I've been digging a bit more into my problem and the charset
    information seems to vanish down a black hole called
    processParameters() in class/mime/Rfc822Header.class.php. I don't
    know much about the contents of the RFCs, but the headers I get
    usually look like this:

    Content-Type: text/plain; charset=UTF-8; format=flowed

    The UTF-8 information never gets into the $message->rfc822_header
    array. I found a "FIXME" in processParameters() so maybe someone is
    working on it?
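
To illustrate what would have to survive header parsing, here is a minimal sketch that pulls the charset parameter out of a Content-Type value like the one above. parseContentType() is a hypothetical helper written for this example, not SquirrelMail's processParameters(); it splits naively on ";" and so does not handle quoted values that themselves contain a semicolon.

```php
<?php
/**
 * Hypothetical helper (not part of SquirrelMail): split a Content-Type
 * header value into its media type and its parameters.
 */
function parseContentType(string $value): array
{
    $parts = explode(';', $value);
    // The first segment is the media type, e.g. "text/plain".
    $type = strtolower(trim(array_shift($parts)));
    $params = [];
    foreach ($parts as $part) {
        if (strpos($part, '=') === false) {
            continue; // skip malformed segments without "name=value"
        }
        [$name, $val] = explode('=', $part, 2);
        // Parameter names are case-insensitive; values may be quoted.
        $params[strtolower(trim($name))] = trim(trim($val), '"');
    }
    return ['type' => $type, 'params' => $params];
}

$parsed = parseContentType('text/plain; charset=UTF-8; format=flowed');
// Fall back to us-ascii, the MIME default, when no charset is declared.
$charset = $parsed['params']['charset'] ?? 'us-ascii';
// Only a body that is really declared as UTF-8 should be decoded as UTF-8.
$isUtf8 = strcasecmp($charset, 'utf-8') === 0;
```

With the charset available, a reply fix could decode only when $isUtf8 is true and leave other encodings alone.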

     
  • Tomas Kuliavas

    Tomas Kuliavas - 2003-09-28

    Logged In: YES
    user_id=225877

    SquirrelMail does not do any decoding when you reply to or
    forward a message. Doing so may result in incorrectly
    formatted emails.

    For example, suppose you use iso-8859-1 encoding and reply to
    a utf-8 or koi8-r email with Cyrillic symbols. The browser
    actually lets you read and submit Cyrillic in the text box,
    but it translates these symbols to HTML codes, and the
    resulting email will contain &name; or &#number; instead of
    Cyrillic.
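
The effect described above can be approximated with a small sketch. simulateLatin1FormSubmit() is a hypothetical name; the function only imitates, roughly, what a browser does when a form declared as iso-8859-1 receives characters outside that charset, and it leaves the rest of the text as UTF-8 rather than converting it.

```php
<?php
/**
 * Hypothetical illustration: replace every character that does not fit
 * into iso-8859-1 (code point above U+00FF) with a numeric HTML entity.
 * The input is assumed to be valid UTF-8.
 */
function simulateLatin1FormSubmit(string $text): string
{
    $result = preg_replace_callback(
        '/[^\x{00}-\x{FF}]/u',
        function (array $m): string {
            // Decode the multi-byte UTF-8 sequence into its code point.
            $bytes = array_values(unpack('C*', $m[0]));
            $n = count($bytes);
            // Strip the length marker bits from the lead byte.
            $cp = $bytes[0] & (0xFF >> ($n + 1));
            for ($i = 1; $i < $n; $i++) {
                // Each continuation byte contributes six payload bits.
                $cp = ($cp << 6) | ($bytes[$i] & 0x3F);
            }
            return '&#' . $cp . ';';
        },
        $text
    );
    // preg_replace_callback() returns null when the input is not valid UTF-8.
    return $result ?? $text;
}

// Cyrillic capital letter Pe (U+041F) does not exist in iso-8859-1.
echo simulateLatin1FormSubmit("\u{041F}"), "\n"; // prints &#1055;
```

A plain-text mail client shows such a reply as literal &#1055; sequences, which is the problem with decoding into the compose box when the interface charset cannot hold the original characters.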

    It does not look good in mutt or Mozilla. I am not sure
    whether Outlook Express is able to parse it. That is the
    reason why a patch that runs decoding when replying or
    forwarding hasn't reached SM yet.

     
  • Tomas Kuliavas

    Tomas Kuliavas - 2003-11-12
    • assigned_to: nobody --> tokul
     
  • Pavel Vondricka

    Pavel Vondricka - 2003-12-30

    Logged In: YES
    user_id=876771

    Actually, the message body is never converted from the original
    charset to the default charset when trying to reply to or
    forward an old message. I found a solution which may be
    universal. I added this line:

    $body = charset_decode
    ($message->header->getParameter('charset'), $body);

    at around line 665, before "switch($action)", in the function
    "newMail(...)".
    If it isn't right, sorry, I'm not a hacker, but it works for
    me and it took me many hours to solve this. Actually, my
    problem is the other way around: I want to use UTF-8 as the
    default, but I also want to reply to (and forward) messages
    written in iso-8859-1 and iso-8859-2.

    Pavel

     
  • Tomas Kuliavas

    Tomas Kuliavas - 2003-12-30

    Logged In: YES
    user_id=225877

    Problem No. 1: utf8_decode converts a string with
    ISO-8859-1 characters encoded with UTF-8 to single-byte
    ISO-8859-1.

    The iso-8859-1 encoding is used only in some of the
    SquirrelMail translations.

    Problem No. 2: there may be other encodings. An email can
    contain two MIME blocks encoded with different charsets. I
    think your patch assumes that the entire email is utf-8 encoded.

    Similar code is included in 1.5.0cvs, but it has some
    limitations. Headers are not yet decoded.

    The biggest problem: utf-8 supports more symbols than the
    charset that you use. You can't decode from utf-8 to regular
    iso-8859-x or windows-125x charsets without losing symbols.

    A real fix would have to:
    1. add decoding to the reply/forward/mdn functions
    2. add checks that do not allow sending emails with
    unsupported symbols.

    Currently I am working on the first one and trying to put
    decoding into compose where it is possible.

     
  • Tomas Kuliavas

    Tomas Kuliavas - 2004-08-27

    Logged In: YES
    user_id=225877

    SquirrelMail 1.5.1cvs from 2004-08-27 includes an option that
    allows lossy charset conversion in the compose window. See 10.
    Languages -> 6. Enable lossy encoding.

    If this option is enabled, squirrelmail should convert
    message from utf8 to iso-8859-1. If email contains
    characters unsupported by iso-8859-1, they will be replaced
    with question marks.
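
The lossy behaviour described above can be sketched as follows. This is an illustration only, assuming valid UTF-8 input; utf8ToLatin1Lossy() is a hypothetical name and the code is not taken from SquirrelMail.

```php
<?php
/**
 * Hypothetical helper: convert UTF-8 text to iso-8859-1 and replace every
 * character that iso-8859-1 cannot represent with a question mark.
 */
function utf8ToLatin1Lossy(string $text): string
{
    // With /u the dot matches one whole UTF-8 character; /s includes newlines.
    $result = preg_replace_callback(
        '/./su',
        function (array $m): string {
            $char = $m[0];
            if (strlen($char) === 1) {
                return $char; // plain ASCII, identical in both charsets
            }
            if (strlen($char) === 2) {
                // Two-byte sequences cover U+0080 to U+07FF.
                $cp = ((ord($char[0]) & 0x1F) << 6) | (ord($char[1]) & 0x3F);
                if ($cp <= 0xFF) {
                    return chr($cp); // representable as one iso-8859-1 byte
                }
            }
            return '?'; // everything else is lost
        },
        $text
    );
    // preg_replace_callback() returns null for invalid UTF-8; keep the input then.
    return $result ?? $text;
}

// "Grüße" fits into iso-8859-1; the hex dump shows single bytes fc and df.
echo bin2hex(utf8ToLatin1Lossy("Gr\u{00FC}\u{00DF}e")), "\n"; // prints 4772fcdf65
// Two Cyrillic letters do not fit and come back as question marks.
echo utf8ToLatin1Lossy("\u{041F}\u{0440}"), "\n"; // prints ??
```

The conversion cannot be undone, which is why it sits behind an explicit option rather than being applied to every reply.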

    Currently squirrelmail 1.5.1cvs supports iso-8859-1 and
    utf-8 encoding. other charsets will use us-ascii encoding
    that converts all 8bit symbols to question marks.

     
  • Tomas Kuliavas

    Tomas Kuliavas - 2004-11-07

    Logged In: YES
    user_id=225877

    Fixed in 1.4.4cvs and 1.5.1cvs. lossy_encoding option must
    be set to true, because iso-8859-1 does not cover all
    symbols supported by utf-8 charset.

     
  • Tomas Kuliavas

    Tomas Kuliavas - 2004-11-07
    • status: open --> closed-fixed
     
