From: Tomas K. <to...@us...> - 2003-01-29 11:11:39
Attachments:
recode-iconv-i18n.diff.gz
Hi,

I have attached a patch that tries to use internal PHP functions when converting symbols from other charsets. It can be applied to squirrelmail/functions/i18n.php with the 'zcat recode-iconv-i18n.diff.gz | patch' command. Does anybody want to test it? :)

To test the patch, your PHP needs to be compiled with recode or iconv support. Otherwise the old SquirrelMail functions are still used.

The recode functions work better if you use a non-UTF-8 locale, but you may get errors if a message uses a charset that recode does not recognize.

The iconv functions show question marks instead of unknown symbols with a non-UTF-8 locale; with a UTF-8 locale more letters are translated. You also need iconv v.1.5 or newer.

If your PHP has both recode and iconv support, the recode functions are used.

http://www.php.net/manual/en/function.recode-string.php
http://www.php.net/manual/en/function.iconv.php

Comments and fixes are welcome. The patch does not fix problems with multipart/alternative.

--
Tomas
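A minimal sketch of the fallback order described in this message (recode first, then iconv, then the existing routines). The function name `sketch_convert_charset` and the `$legacy` callback are hypothetical and only stand in for the patched code in functions/i18n.php, which is not reproduced in this thread.

```php
<?php
// Hypothetical helper, not the actual patch: try recode, then iconv,
// then whatever conversion routine the caller already had.
function sketch_convert_charset($string, $from, $to, $legacy)
{
    // Prefer recode when the extension is compiled in.
    if (function_exists('recode_string')) {
        $converted = recode_string($from . '..' . $to, $string);
        if ($converted !== false) {
            return $converted;
        }
    }

    // Otherwise try iconv; @ silences the message raised on bad input.
    if (function_exists('iconv')) {
        $converted = @iconv($from, $to, $string);
        if ($converted !== false) {
            return $converted;
        }
    }

    // Neither extension is available or both failed: use the old routine.
    return $legacy($string);
}

// Usage: the closure stands in for the old SquirrelMail functions.
$identity = function ($s) {
    return $s;
};
echo sketch_convert_charset('abc', 'ISO-8859-1', 'UTF-8', $identity), "\n";
```

Because every branch is guarded by function_exists(), the sketch adds no hard dependency on either extension.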
From: Alexandros V. <av...@no...> - 2003-01-29 11:57:55
On Wed, 29 Jan 2003 13:07:46 +0200 (GMT-2)
"Tomas Kuliavas" <to...@us...> wrote:

> Attached patch that tries to use internal php functions when
> converting symbols from other charsets.

Very nice! I have applied it in my tree, and so far I don't see any problems with it.

And a further idea: why not support international folder names as well (apart from ISO-8859-1, which is already supported)? It can be done in the same fashion:

    if (function_exists('recode_string')) {
        imap_utf7_encode_local_recode() ....
    } else {
        // current Squirrely way
    }

This way, not only will universal folder names be supported, but there are also no additional software requirements, something you brag about a lot. :) The recode function will only be used when available.

Shall I prepare a patch? (I know, it's kind of a new feature for 1.4, but it's not intrusive at all. FWIW, I've been using recode for imap_utf_7_??code for months now.)

--
Alexandros Vellis              University of Athens
av...@no...                    Network Operations Centre
http://www.noc.uoa.gr/~avel/
Public Key: http://www.noc.uoa.gr/~avel/gpgkey.asc
From: Tomas K. <to...@us...> - 2003-01-29 12:11:36
> On Wed, 29 Jan 2003 13:07:46 +0200 (GMT-2)
> "Tomas Kuliavas" <to...@us...> wrote:
>
>> Attached patch that tries to use internal php functions when
>> converting symbols from other charsets.
>
> Very nice! I have applied it in my tree, I don't currently see any
> problems with it in.

Here is one problem. OE can create Hebrew emails with "iso-8859-8-i":

    /* catch iso-8859-8-i from OE */
    if ($charset == "iso-8859-8-i")
        $charset = "iso-8859-8";

> And a further idea: why not support international folder names as well
> (apart from ISO-8859-1 which is already supported)? It can be done in
> the same fashion:
>
> if(function_exists('recode_string') ) {
> imap_utf7_encode_local_recode() ....
>
> } else {
> // current Squirrely way
>
> }
>
> This way, not only will universal folder names be supported, but also
> there are not more software requirements, something you brag about a
> lot. :) - recode function will only be used when available.
>
> Shall I prepare a patch? (I know, it's kind of a new feature for 1.4,
> but it's not intrusive at all, FWIW I've been using recode for
> imap_utf_7_??code for months now).

You can always create a patch. It is possible that it will be introduced into official SM only in 1.4.1cvs.

--
Tomas
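The iso-8859-8-i check above could be generalised into a small alias lookup. This is only a sketch: the function name `sketch_normalize_charset` and the table are hypothetical, and the only alias taken from this thread is iso-8859-8-i. It compares case-insensitively, on the assumption that charset labels in mail headers may arrive in any case.

```php
<?php
// Hypothetical helper: map charset labels that the converters do not
// recognise onto names they do. Only iso-8859-8-i comes from the thread.
function sketch_normalize_charset($charset)
{
    $aliases = array(
        'iso-8859-8-i' => 'iso-8859-8',
    );

    // Charset labels are matched case-insensitively.
    $key = strtolower(trim($charset));

    // Return the mapped name, or the lowercased label if no alias exists.
    return isset($aliases[$key]) ? $aliases[$key] : $key;
}

echo sketch_normalize_charset('ISO-8859-8-I'), "\n";
```

Note that the sketch returns the label lowercased even when no alias matches.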
From: Alexandros V. <av...@no...> - 2003-02-05 10:38:48
On Wed, 29 Jan 2003 14:07:41 +0200 (GMT-2)
"Tomas Kuliavas" <to...@us...> wrote:

> > On Wed, 29 Jan 2003 13:07:46 +0200 (GMT-2)
> > "Tomas Kuliavas" <to...@us...> wrote:
> >
> >> Attached patch that tries to use internal php functions when
> >> converting symbols from other charsets.

Here's another problem, Tomas. With this patch (using recode()), I get the entities encoded twice, and it screws up the message display. That is, instead of &quot; I get &amp;quot;

Is there any other place in Squirrel where entity processing/substitution takes place?

FWIW, I used this:

    $request = $charset . '..' . $languages[$sm_notAlias]['CHARSET'];

instead of this:

    $request = $charset . '..html';

And it now works OK.

Alexandros
From: Tomas K. <to...@us...> - 2003-02-05 12:12:59
> On Wed, 29 Jan 2003 14:07:41 +0200 (GMT-2)
> "Tomas Kuliavas" <to...@us...> wrote:
>
>> > On Wed, 29 Jan 2003 13:07:46 +0200 (GMT-2)
>> > "Tomas Kuliavas" <to...@us...> wrote:
>> >
>> >> Attached patch that tries to use internal php functions when
>> >> converting symbols from other charsets.
>
> Here's another problem, Tomas. With this patch (using recode()), I get
> the entities twice, and it screws up the message display. That is,
> instead of &quot; , I get &amp;quot;
>
> Is there any other place in Squirrel that entity processing/substitution
> takes place?

The htmlentities and htmlspecialchars functions are used in some places.

> FWIW, I used this:
>
> $request = $charset . '..' . $languages[$sm_notAlias]['CHARSET'];
>
> instead of this
>
> $request = $charset . '..html';
>
> And it now works ok.

Only if your charset knows the decoded symbols. If symbols are unknown, they are translated to Latin or removed.

http://www.topolis.lt/squirrelmail/patches/20021027/test_utf8.eml

We need to pass the -d option to the recode function (I don't know if that is possible), or:

----
/* try to use recode functions */
if ( function_exists('recode_string') ) {
    $request = $charset . '..html';
    $string = recode_string($request,$string);
    $string = str_replace("&amp;", '&', $string);
    return $string;
}
----

Plus I think I need to add additional tests in order to disable decoding in some cases or go back to the old way.

--
Tomas
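The doubled entities reported in this exchange are what any second escaping pass produces, since the "&" that starts an existing entity is itself escaped. The sketch below reproduces the effect with htmlspecialchars() alone; it does not use recode, and the variable names are illustrative. The fourth argument (double_encode) used in the last call exists only in later PHP releases (5.2.3 and up), so it was not an option for the code discussed in this thread.

```php
<?php
$raw = 'say "hi"';

// First pass: the quotes become entities.
$once = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');

// Second pass: the "&" of each entity is escaped again.
$twice = htmlspecialchars($once, ENT_QUOTES, 'UTF-8');

// With double_encode = false, existing entities are left alone.
$guarded = htmlspecialchars($once, ENT_QUOTES, 'UTF-8', false);

echo $once, "\n";    // say &quot;hi&quot;
echo $twice, "\n";   // say &amp;quot;hi&amp;quot;
echo $guarded, "\n"; // say &quot;hi&quot;
```

Replacing "&amp;" with "&" after the conversion, as proposed above, undoes exactly that second pass.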
From: Alexandros V. <av...@no...> - 2003-02-05 13:25:59
On Wed, 5 Feb 2003 14:08:28 +0200 (GMT-2)
"Tomas Kuliavas" <to...@us...> wrote:

> htmlentities and htmlspecialchars functions are used in some places.

Aha... Could checks be made in these places, in the same fashion (function_exists("recode_string"))? If the text has already been encoded properly, there is no need to do it again, right? And this only affects the message body, and probably header lines such as Subject and From.

> Only if your charset knows decoded symbols. If symbols are unknown
> they are translated to latin or removed.
>
> http://www.topolis.lt/squirrelmail/patches/20021027/test_utf8.eml

I see... My thought was to do just the character conversion *alone* in this part, and leave the entities for later.

> We need to pass -d option to recode function (don't know if it is
> possible) or
> ----
> /* try to use recode functions */
> if ( function_exists('recode_string') ) {
> $request = $charset . '..html';
> $string = recode_string($request,$string);
> $string = str_replace("&amp;", '&', $string);

Here, all HTML entities should be replaced, like &quot; etc.

> return $string;
> }
> ----
>
> Plus I think I need to add additional tests in order to disable
> decoding in some cases or go back to old way.

I hope not. I like recode. :-)

I've been having some more trouble lately with i18n stuff; I'll post another mail a bit later. (It's pizza time now :-))

Greetings,
Alexandros