#535 enc_utf8 problem with non-Latin header

2.17
closed
nobody
5
2012-09-17
2007-09-05
osyp
No

I think headers in most non-Latin languages, including Thai, must be encoded, and they often span multiple lines.

For example:

Subject: =?UTF-8?B?Mi4g7JWg7J246rWs7ZWoICA6IOWbveWutuagh+WHhueggSA6IOC4geC4suC4?=
=?UTF-8?B?o+C5gOC4guC5ieC4suC4o+C4q+C4seC4qg==?=

OR

Subject: =?windows-874?B?Rlc6ICCi6NLHu8PQqtLK0cG+0bm47DogzdS54LfFyuinq9W+1cLZpMfq?=
=?windows-874?B?zbSkzcPsIOPL6c3Rra3SIM251ODBqtHouSDKw+nSp6TH0sHKtOPK48vp?=
=?windows-874?B?odLD7LXZuSA0IEFuZ2llcw==?=

Removing the spaces joins the encoded words into a single string, so the function no longer treats it as encoded text and returns it as plain text. (The Novell Evolution email client did the same.)

=?UTF-8?B?Mi4g7JWg7J246rWs7ZWoICA6IOWbveWutuagh+WHhueggSA6IOC4geC4suC4?==?UTF-8?B?o+C5gOC4guC5ieC4suC4o+C4q+C4seC4qg==?=
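A minimal sketch of why the space matters, assuming a decoder that follows the RFC 2047 rule that an encoded word must be separated from its neighbours by whitespace. The helper name count_encoded_words and the short sample header are hypothetical and are not part of the project's code.

```php
<?php
// Hypothetical helper, for illustration only: count the RFC 2047 encoded
// words that are delimited by whitespace or by the ends of the string.
function count_encoded_words(string $header): int
{
    // (?<!\S) and (?!\S) require whitespace, or the string boundary,
    // on each side of the =?charset?encoding?text?= token.
    $pattern = '/(?<!\S)=\?[^?\s]+\?[BbQq]\?[^?\s]*\?=(?!\S)/';
    return (int) preg_match_all($pattern, $header);
}

// Two short encoded words separated by a single space (sample data).
$unfolded = '=?UTF-8?B?4LiB4Liy?= =?UTF-8?B?4Lij?=';

// The same header after the space removal that enc_utf8 used to apply.
$stripped = str_replace(' ', '', $unfolded);

var_dump(count_encoded_words($unfolded));
var_dump(count_encoded_words($stripped));
```

With the separating space in place, both encoded words are recognized. Once the space is removed, the two tokens run together as "?==?" and this strict matcher finds none, which is consistent with the header coming back undecoded.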

After commenting out the line
$str = str_replace(" ", "", $str);
the problem went away.

However, I also noticed that English characters had been changed to ALL CAPS.

Below is my modified enc_utf8, which works with Thai.

function enc_utf8($str) {
    // Some mail clients create encoded strings such as: =?iso-8859-1?Q? "Andr=E9=20Mc=20Intyre" ?=
    // that contain spaces, which they must not. The original code removed the spaces before
    // converting to UTF-8; that removal is disabled here because it breaks multi-line headers.

    if (preg_match("/=\?/", $str)) {
        // $str = str_replace(" ", "", $str);
        // return imap_utf8($str);
        $text = '';
        if ($elements = imap_mime_header_decode($str)) {
            foreach ($elements as $element) {
                // The charset of the last element is applied to the whole header.
                $head_charset = $element->charset;
                $text .= $element->text;
            }
            switch (strtolower($head_charset)) {
                case 'utf-8':
                    return $text;

                case 'default':
                    // return imap_utf8($str);
                    return $text;

                default:
                    $converted = @iconv($head_charset, 'UTF-8', $text);
                    // Fall back to the unconverted text if iconv fails.
                    return ($converted !== false) ? $converted : $text;
            }
        } else {
            if (function_exists('iconv')) {
                // if ($converted = @iconv('ISO-8859-15', $GLOBALS['charset'].'//IGNORE', $str))
                if ($converted = @iconv('windows-874', 'UTF-8//IGNORE', $str)) {
                    return $converted;
                }
            }
        }
    }
    return $str;
}
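The first UTF-8 example above appears to split a multi-byte character across two encoded words, so the decoded bytes have to be joined before any charset conversion. The following is a rough sketch of that idea without the IMAP extension. The function name decode_encoded_words is hypothetical, it ignores any plain text between encoded words, and it assumes all words share one charset, so it is an illustration and not a replacement for enc_utf8.

```php
<?php
// Hypothetical helper, for illustration only: decode RFC 2047 encoded words
// by concatenating their raw bytes first and converting the charset once.
function decode_encoded_words(string $header): string
{
    $pattern = '/=\?([^?\s]+)\?([BbQq])\?([^?\s]*)\?=/';
    if (!preg_match_all($pattern, $header, $matches, PREG_SET_ORDER)) {
        return $header; // nothing that looks like an encoded word
    }

    $bytes = '';
    $charset = 'UTF-8';
    foreach ($matches as $m) {
        $charset = $m[1]; // assumption: every word uses the same charset
        if (strtoupper($m[2]) === 'B') {
            $bytes .= (string) base64_decode($m[3]);
        } else {
            // In the Q encoding an underscore stands for a space.
            $bytes .= quoted_printable_decode(str_replace('_', ' ', $m[3]));
        }
    }

    // Convert only after joining, so a character split across two
    // encoded words is complete by the time iconv sees it.
    $converted = @iconv($charset, 'UTF-8', $bytes);
    return ($converted !== false) ? $converted : $bytes;
}

// Sample data: three Thai characters (9 bytes) split after the 5th byte,
// i.e. in the middle of the second character.
$subject = '=?UTF-8?B?4LiB4Lg=?= =?UTF-8?B?suC4ow==?=';
echo decode_encoded_words($subject), "\n";
```

Decoding each word on its own would hand iconv an incomplete character at the end of the first word; joining the bytes first avoids that.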

Discussion

  • osyp
    2007-09-05

    multiple UTF-8 encoded From, Subject headers

     
  • osyp
    2007-09-05

    Logged In: YES
    user_id=1883213
    Originator: YES

    File Added: GO-enc-005.png

     
  • osyp
    2007-09-05

    Problem solved with modified enc_utf8 function

     
  • Logged In: YES
    user_id=733975
    Originator: NO

    I took out the space removal. Thanks.