Hi,
I've tested it. The patch doesn't fix the issue. With it applied, I see wrong characters in headers:
http://ptomulik.meil.pw.edu.pl/quoted-printable-patch-version_2-err1.png
When replying to a message, Polish characters get garbled:
http://ptomulik.meil.pw.edu.pl/quoted-printable-patch-version_2-err3.png
Setting
$default_htmlspecialchars_encoding = 'UTF-8'
doesn't help at all; the problem with disappearing headers comes back: http://ptomulik.meil.pw.edu.pl/quoted-printable-patch-version_2-err2.png
The patch seems to be wrong. For unsupported charsets it simply ignores the header's encoding, passed as the $encoding argument to sm_encode_html_special_chars(), and assumes its own "fixed encoding" (either ISO-8859-1 or a default provided in config_local.php). As a result, the text is processed by htmlspecialchars() using an $encoding that doesn't match the input text. Even if htmlspecialchars() were able to handle this case correctly and (say) automatically convert the text to this "fixed" encoding, the text returned by sm_encode_html_special_chars() would then be in this new "fixed" encoding, which the caller does not expect, and further correct text processing would be impossible.
I think the only correct way to handle encodings incompatible with htmlspecialchars() is to convert the input text to UTF-8 (or any encoding compatible with both the input text's language and htmlspecialchars()), process it with htmlspecialchars(), and then convert it back to the original encoding given by the $encoding argument to sm_encode_html_special_chars(). Performance is not an argument here - a mail client should simply display messages correctly.
I've been testing this too, and it seems to fix the problem for me. Thank you!
For trunk
Paweł says the new patch is causing some characters to get mangled (Paweł, your post was moderated and approved, yet SF seems to have eaten it for lunch). That is not surprising: his text is iso-8859-2, which isn't the same as iso-8859-1, so there are going to be a few inconsistencies (but notably, we have received very few problem reports about this in 10+ years). Please check the same messages under SM 1.4.22, which uses htmlspecialchars() with no charset argument, so the same mangling should happen. If you want to test with the $default_htmlspecialchars_encoding setting in config_local.php, it makes more sense in your case to try setting it to iso-8859-2. I appreciate the problem, but am not too excited about the solution PHP leaves us with.
Again:
http://www.php.net/manual/en/function.htmlspecialchars.php
The behavior of htmlspecialchars() has changed since PHP 5.4. Also note the ENT_IGNORE flag, which was introduced in PHP 5.3 - maybe previous versions of htmlspecialchars() worked as if ENT_IGNORE was in effect, and now the function is more choosy?
I think the issue will become more important as more users upgrade their servers to PHP 5.4.
Last edit: Paweł Tomulik 2013-07-11
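To make the ENT_IGNORE point concrete, here is a minimal sketch of how htmlspecialchars() treats bytes that are not valid in the charset it is told to use. The sample string is an assumption: it is the iso-8859-2 sender name from a header quoted elsewhere in this thread, written as raw bytes. It is only an illustration, not SquirrelMail code.

```php
<?php
// Raw ISO-8859-2 bytes for "Finančna hiša" plus an address in angle brackets
// (sample data taken from a From: header quoted elsewhere in this thread).
$raw = "Finan\xE8na hi\xB9a <xxx@xxx.xx>";

// Strict UTF-8 handling: 0xE8 followed by "n" is not a valid UTF-8 sequence,
// so the whole call returns an empty string.
$strict = htmlspecialchars($raw, ENT_COMPAT, 'UTF-8');

// ENT_IGNORE (PHP 5.3+) discards the invalid bytes instead of failing,
// which silently loses the accented letters.
$ignored = htmlspecialchars($raw, ENT_COMPAT | ENT_IGNORE, 'UTF-8');

// A single-byte charset has no invalid sequences: high bytes pass through
// untouched and only the HTML special characters are replaced.
$latin1 = htmlspecialchars($raw, ENT_COMPAT, 'ISO-8859-1');

var_dump($strict === '');
echo $ignored, "\n";
echo bin2hex($latin1), "\n";
```

The empty result in the strict case would match the blank subject and sender fields reported here, although that link is an inference from the documented behavior rather than something verified against SquirrelMail itself.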
Sorry, of course you can't set it to iso-8859-2 since PHP won't support that. What happens with your messages under SM 1.4.22?
With SM 1.4.22 vanilla + PHP 5.4 headers that use ISO-8859-2 or Windows-1250 appear blank.
Last edit: Paweł Tomulik 2013-07-11
Toni, please review this thread and apply the latest patch I have provided. It will not fix all iso-8859-2 characters, but it will make things look better than any previous version of SquirrelMail. I plan an optional, more thorough fix later.
I was still having problems with the Debian wheezy version of SquirrelMail (1.4.23 [SVN]), and from browsing the code I see the problem persists. Mainly, htmlspecialchars() (inside sm_encode_html_special_chars()) fails on input strings in encodings that the function does not support. My solution was to iconv the string to UTF-8 before calling htmlspecialchars(), and to return the result iconv-ed back to the previous encoding. This is certainly a performance hit, which could be alleviated a bit with a switch/case, since I don't check whether the current encoding is supported by htmlspecialchars(). It fixes the problems for the mails I was getting (iso-8859-2, windows-1250).
I basically changed the:
    if (check_php_version(5, 2, 3))
        return htmlspecialchars($string, $flags, $encoding, $double_encode);
    return htmlspecialchars($string, $flags, $encoding);
in the sm_encode_html_special_chars function into
    $string = iconv($encoding, 'UTF-8', $string);
    if (check_php_version(5, 2, 3))
        $ret = htmlspecialchars($string, $flags, 'UTF-8', $double_encode);
    else
        $ret = htmlspecialchars($string, $flags, 'UTF-8');
    return iconv('UTF-8', $encoding, $ret);
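The switch/case idea mentioned above (skip the iconv round trip when htmlspecialchars() already understands the charset) could look roughly like the sketch below. The function name escape_preserving_charset() and the short charset list are assumptions made for illustration; this is not the patch from this tracker, and the list is only a subset of what the PHP manual documents as supported.

```php
<?php
// Hypothetical helper, not part of SquirrelMail.
function escape_preserving_charset($string, $encoding, $flags = ENT_COMPAT)
{
    // Assumed subset of the charsets htmlspecialchars() accepts natively.
    static $native = array('iso-8859-1', 'iso-8859-15', 'utf-8',
                           'windows-1251', 'windows-1252', 'koi8-r');

    if (in_array(strtolower($encoding), $native, true)) {
        // No conversion needed: escape directly in the original charset.
        return htmlspecialchars($string, $flags, $encoding);
    }

    // Unsupported charset: convert to UTF-8, escape, convert back.
    $utf8 = @iconv($encoding, 'UTF-8', $string);
    if ($utf8 !== false) {
        $back = @iconv('UTF-8', $encoding, htmlspecialchars($utf8, $flags, 'UTF-8'));
        if ($back !== false) {
            return $back;
        }
    }

    // Conversion failed (unknown charset or bad bytes): fall back to
    // byte-wise escaping, which leaves all high bytes unchanged.
    return htmlspecialchars($string, $flags, 'ISO-8859-1');
}

// iso-8859-2 input comes back in iso-8859-2 with only <, > and & escaped.
echo bin2hex(escape_preserving_charset("hi\xB9a <a&b>", 'iso-8859-2')), "\n";
```

The caller gets the text back in the charset it passed in, which is the property the earlier comment in this thread asks for; the fallback branch is a design choice of this sketch, not something taken from the posted patches.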
I finally got back to a computer and actually had some time to check the posted patches. Unfortunately I don't have ALL the problematic messages any more (especially the windows-1250 ones), but I checked some of the iso-8859-2 ones that used to cause problems and those seem good now. For the moment it looks as though the patch DID fix the problem, but I'll keep watching and re-post if I come across any more problematic messages.
PS: Thanks for the fixes.
I have the same problem with 1.4.22 in both the subject and the sender address:
Subject: =?windows-1256?B?yNHkx+PMIMfhw+bUxyDcT1NIQdwg3ewgx+HV5Mfa?=
=?windows-1256?B?x8ogx+Hax+PJINxPU0hBIEdlbmVyYWwgSW5kdXN0cmllc9wg38fR7dLjxw==?=
=?windows-1256?B?IC0gMjItz+3T48jR?=
From: =?windows-1256?B?5tnHxt0gx+HT5s/H5A==?= jobs4sd@gmail.com
both are blank!!???
Can anyone (Paweł?) who is using SM 1.4.23-SVN with the most recent patch I provided and who has some emails with mangled characters please upload the full message source and/or paste in a header containing characters that get mangled?
Other people posting here (especially if you are seeing entirely blank subject or from fields) sound as if they aren't using SM 1.4.23-SVN or have not applied the patch.
Sending as attachments the two messages shown here: http://ptomulik.meil.pw.edu.pl/quoted-printable-patch-version_2-err2.png. Hope this helps.
Thanks, but those are not problematic - they display fine. What I'm looking for are messages with characters that don't map back to iso-8859-1 correctly, such as in your "err1" screenshot.
Attached two messages from err1 screen.
The strings that have munged characters in both of those messages are given as unencoded headers despite the fact that they have 8-bit characters in them. That is a problem with the originating software, not a bug in SquirrelMail.
In the case of these two particular messages, it appears the sender assumes utf-8, so if we switch the Polish translation to utf-8 instead of iso-8859-2, these headers will be displayed correctly. However, when you get similarly malformed messages with unencoded iso-8859-2 (or other) characters, you will see the same "problem."
If you want to help change the Polish translation to utf-8, which is a goal for the SquirrelMail project, please be in touch with the translation maintainers and the squirrelmail-i18n mailing list.
For this issue, does anyone have any properly formed messages that are not displayed correctly with the patch I have provided?
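For anyone trying to sort their own samples into "properly formed" and "malformed", a rough check along these lines may help. It is only a sketch: the function name classify_raw_header() and its return labels are made up here, and it looks at nothing but the raw bytes of the header value.

```php
<?php
// Hypothetical helper: classify a raw (undecoded) header value.
function classify_raw_header($value)
{
    // Properly formed headers are pure ASCII; non-ASCII text must be
    // wrapped in =?charset?Q?...?= or =?charset?B?...?= encoded words.
    if (!preg_match('/[\x80-\xFF]/', $value)) {
        return 'ascii';
    }
    // With the /u modifier, preg_match() fails on invalid UTF-8,
    // so an empty pattern works as a validity test.
    return preg_match('//u', $value) === 1 ? 'raw-utf8' : 'raw-8bit-unknown';
}

echo classify_raw_header('=?iso-8859-2?Q?RE:PORO=C8ILO-NI_OK?='), "\n";
echo classify_raw_header("Za\xC5\xBC\xC3\xB3\xC5\x82\xC4\x87"), "\n";
echo classify_raw_header("Za\xBF\xF3\xB3\xE6"), "\n";
```

A header that falls into either of the last two categories has no declared charset, so any webmail client can only guess how to display it.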
I've just checked on my side (1.4.23-SVN + your patch). All messages seem to show up fine. Thank you!
Hello,
I'm hit by the problem. After some tests: I use the openSUSE 13.1 version of SquirrelMail, given as "1.5.2-13.1.2", with PHP 5.4.20.
This version shows a "(no subject)" link when it can't display the subject, but there is a subject! And this subject is not copied into the reply when answering.
The subject is: " =?iso-8859-1?Q?Nouveau=20d=E9compte=20de=20remboursement=20N=B0=20EU1482320?="
In case it does not display properly here, here is a screenshot: http://dodin.org/owncloud/public.php?service=files&t=0b655c204c7d02d53318b69b00571f5a
I also do not know where to find the latest patch - this thread is very long :-(
thanks
jdd
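For reference, the subject quoted above is a well-formed RFC 2047 encoded word. A small sketch using PHP's iconv_mime_decode() shows what it should decode to; this is independent of SquirrelMail's own decoding code, and the variable names are only for illustration.

```php
<?php
// The encoded subject exactly as quoted in the report above.
$subject = '=?iso-8859-1?Q?Nouveau=20d=E9compte=20de=20remboursement=20N=B0=20EU1482320?=';

// Decode the encoded word and convert the result to UTF-8.
$decoded = iconv_mime_decode($subject, 0, 'UTF-8');

// Escaping for HTML output is safe once the text really is UTF-8.
$html = htmlspecialchars($decoded, ENT_COMPAT, 'UTF-8');

echo $html, "\n"; // Nouveau décompte de remboursement N° EU1482320
```

So the header itself is fine; the "(no subject)" placeholder has to come from how the decoded bytes are escaped afterwards.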
1) We do not support third party SquirrelMail packages
2) Your link is bad
3) Look for quoted_printable_fix-1.5.2-version_2.diff on the previous page, but again, we make no guarantees about its applicability to third party packages
I found there's still a problem with mails in other encodings. While the subjects display properly, I occasionally miss whole first paragraphs of messages if they contain non-standard characters (I can still see the text in 'View Message Details'). Example message (taken from 'View Message Details'):
Return-Path: xxx@xxx.xx
X-Original-To: xxx@xxx.xx
Delivered-To: xxx@xxx.xx
Received: from localhost (localhost [127.0.0.1])
by xxx.xxx.xx(Postfix) with ESMTP id AE17F1F4F5
for xxx@xxx.xx; Tue, 17 Jun 2014 15:50:13 +0200 (CEST)
Received: from xxx.xxx.xx([127.0.0.1])
by localhost (xxx.xxx.xx[127.0.0.1]) (amavisd-new, port 10024)
with ESMTP id 2JpDCjQuR-kV for xxx@xxx.xx;
Tue, 17 Jun 2014 15:50:02 +0200 (CEST)
Received: from xxx.xx(xxx.xx[xx.xx.xx.xx])
(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
(No client certificate requested)
by xxx.xxx.xx(Postfix) with ESMTPS id 6BD2C1F2EC
for xxx@xxx.xx; Tue, 17 Jun 2014 15:50:02 +0200 (CEST)
Received: from xxx.xx(xxx.xx[212.18.32.15])
by xxx.xx(Postfix) with ESMTP id 47B4481377
for xxx@xxx.xx; Tue, 17 Jun 2014 15:50:01 +0200 (CEST)
Received: from xxx.xx(localhost [127.0.0.1])
by xxx.xx(Postfix) with ESMTP id 45979C9483
for xx@xxx.si; Tue, 17 Jun 2014 15:50:01 +0200 (CEST)
X-Virus-Scanned: amavisd-new at amis.net
Received: xxx.xx([127.0.0.1])
by xxx.xx(xxx.xx[127.0.0.1]) (amavisd-new, port 10024)
with ESMTP id Jdt6NMSb6lqT for xx@xxx.xx;
Tue, 17 Jun 2014 15:50:00 +0200 (CEST)
Received: from xxx.xx(xxx.xx[IPv6:xxx])
by xxx.xx(Postfix) with ESMTP id 470FEC9440
for toni@klicnicenter.si; Tue, 17 Jun 2014 15:50:00 +0200 (CEST)
Received: from xxx (xxx.xx[xx.xx.xx.xx])
by xxx.xx(Postfix) with ESMTP id 201E9C2DA8
for xx@xxx.xx; Tue, 17 Jun 2014 15:49:59 +0200 (CEST)
From: =?iso-8859-2?Q?Finan=E8na_hi=B9a_d.o.o.=2C_Call_center?= xxx@xxx.xx
To: xxx@xxx.xx
References: 001601cf8a2c$f42233b0$dc669b10$@si c3dca170f6af52d6c56da7f54499e47d.squirrel@xxx.xxx.xx
In-Reply-To: c3dca170f6af52d6c56da7f54499e47d.squirrel@xxx.xxx.xx
Subject: =?iso-8859-2?Q?RE:PORO=C8ILO-NI_OK?=
Date: Tue, 17 Jun 2014 15:49:28 +0200
Message-ID: 000101cf8a32$f57afe30$e070fa90$@si
MIME-Version: 1.0
Content-Type: text/plain;
charset="iso-8859-2"
Content-Transfer-Encoding: quoted-printable
X-Mailer: Microsoft Office Outlook 12.0
Thread-Index: Ac+KL/SrvJ+H4j1PTiOc388NNaywBAAAvRKg
Content-Language: sl
SE POSVETUJEM Z BIANKO, =C8E KAJ NE BO =A9LO SE JAVIM OZ. JAVIVA.
LP
While I can only see the 'LP' in message view
Last edit: Toni Rutar Lokar 2014-06-20
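As a cross-check of what that body line should look like once decoded, here is a minimal sketch that undoes the quoted-printable transfer encoding and converts the iso-8859-2 bytes to UTF-8. It assumes the body line exactly as pasted above and does not reproduce SquirrelMail's own message rendering.

```php
<?php
// First body line as shown in 'View Message Details'.
$line = 'SE POSVETUJEM Z BIANKO, =C8E KAJ NE BO =A9LO SE JAVIM OZ. JAVIVA.';

// Step 1: undo Content-Transfer-Encoding: quoted-printable.
$bytes = quoted_printable_decode($line);

// Step 2: convert from the declared charset (iso-8859-2) to UTF-8.
$utf8 = iconv('ISO-8859-2', 'UTF-8', $bytes);

// Step 3: escape for HTML output using the charset the text is now in.
echo htmlspecialchars($utf8, ENT_COMPAT, 'UTF-8'), "\n";
```

The two encoded bytes are the letters Č and Š, so the line should read "ČE KAJ NE BO ŠLO" in the middle; a paragraph containing them vanishing is consistent with the escaping step rejecting those bytes, though that is an assumption rather than a confirmed diagnosis.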
If you would, please pull the message from your mail spool directly and attach, preferably as a tarball or zip file, etc.
Also, specify what your configuration settings are for $default_charset, $lossy_encoding and $squirrelmail_default_language and what language has been selected for the user account that is viewing the message.
(message source was sent privately, testing showed the message was not problematic, moving rest of conversation back here)
... while user language was selected as Slovenian.
config_default.php is not used by SquirrelMail at run time. If you don't have config.php then you are using a third party adaptation of SquirrelMail which we do not support.
Moreover, my testing shows the message is OK; looks to me like your SquirrelMail version is out of date (must be most recent from SVN) and/or you have not applied the most recent patch from this tracker.
I can confirm the annoying behaviour on Debian Testing.
The problem disappears after commenting out one line in /usr/share/squirrelmail/functions/i18n.php, namely
if (! $save_html) $string = sm_encode_html_special_chars ($string);
Example of the subject:
Subject: =?iso-8859-2?Q?M=ECs=ED=E8n=ED_souhrn_spr=E1vce?=
switching off $save_html would help as well, but I do not know where to find it in the config.
Please apply the (most recent) patch from this tracker instead.
Hi! Could you tell me which patch is current for version 1.4.23?
Last edit: Jurand Bień 2015-06-01