Re: [fp-dev] [PATCH] proper use of htmlentities
Moved to GitHub: https://github.com/flatpressblog/flatpress
From: NoWhereMan <now...@fl...> - 2006-11-18 08:28:33
----- Original Message -----
From: "Naoki Hiroshima"
> NoWhereMan wrote:
[escaping with entities]
>> it looks like a way to unescape sequences of entities... something like
>> that...
>> if it finds &entity; it won't change it into &amp;entity; maybe... :P
>
> I see... Besides, today's WP seems to be using:
>
> $text = str_replace('&&', '&#038;&', $text);
> $text = str_replace('&&', '&#038;&', $text);
> $text = preg_replace('/&(?:$|([^#])(?![a-z1-4]{1,8};))/', '&#038;$1',
> $text);
That code was from WP 1.5; they probably changed it in 2.0... OK, maybe we
could sync some of the code.
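
A minimal sketch of the "don't escape what is already an entity" idea discussed above, assuming PHP 7 or later. The function name `escape_ampersands` is hypothetical; this is neither the FlatPress nor the WordPress implementation, just one way to express the rule with a negative lookahead.

```php
<?php
// Hypothetical helper for illustration; not FlatPress or WordPress code.
function escape_ampersands(string $text): string
{
    // Escape every "&" that does not already start a named entity
    // (&amp;), a decimal reference (&#038;) or a hex reference (&#x26;).
    return preg_replace(
        '/&(?!(?:[a-zA-Z][a-zA-Z0-9]{1,31}|#[0-9]{1,7}|#[xX][0-9a-fA-F]{1,6});)/',
        '&amp;',
        $text
    );
}

echo escape_ampersands('Tom & Jerry &amp; friends'), "\n";
```

Because existing entities are skipped, running the function twice gives the same result as running it once. Since PHP 5.2.3, `htmlspecialchars()` also accepts a fourth `double_encode` argument; passing `false` gives similar behaviour without a custom regex.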
[explicit accept-charset in forms]
> The thing is, browsers don't have to send a message in the same encoding
> as the page. This used to be a typical problem in Japanese, since even
> if the page encoding is "euc-jp", some browsers can send a message in
> "sjis". Maybe there is no issue nowadays, but specifying it explicitly is
> better, I suppose.
Oh, I see.
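
A small sketch of what specifying the charset explicitly could look like, assuming PHP 7 or later with the mbstring extension. Both function names are hypothetical and not part of FlatPress. The first emits a form tag with `accept-charset`; the second checks what actually arrives, since the attribute is only a hint to the browser.

```php
<?php
// Hypothetical helpers for illustration; not FlatPress code.
function comment_form_open(string $action): string
{
    // accept-charset asks the browser to submit the fields as UTF-8,
    // whatever encoding the page itself was served in.
    return '<form method="post" accept-charset="UTF-8" action="'
        . htmlspecialchars($action, ENT_QUOTES, 'UTF-8') . '">';
}

function is_utf8_submission(string $value): bool
{
    // Browsers are not obliged to honour the hint, so check on receipt.
    return mb_check_encoding($value, 'UTF-8');
}

echo comment_form_open('/comments.php'), "\n";
```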
> I'm not saying there is no reason, but I can't think of any compelling
> reason why someone would really want to use something other than UTF-8,
> unless it's UTF-16.
The only reason is retaining native compatibility with SPB, which uses
different encodings depending on your language, but I would really drop full
compatibility. The files for the Italian language are already UTF-8 encoded
(not ASCII!), so if you use an encoding other than UTF-8, accented letters
will show up as strange characters.
Maybe I could make the installer detect whether fp-content/content/ is not
empty and ask immediately if you want to perform a conversion, but the
problem is that I can't guess the actual encoding of the files. I tried
the whole mb_* family of functions, but none of them seemed to guess
anything :(
Moreover, I fear server timeouts, and I wouldn't want the conversion to stop
in the middle: you would have to convert again the next time you launch
setup, and that would mess everything up, because it would try
double-converting to UTF-8, treating the already-converted TXTs as European
ISO. I'll let you imagine what would happen.
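
One possible way around the double-conversion worry, sketched under assumptions: PHP 7 or later with mbstring, and a legacy encoding supplied by whoever runs the setup, since single-byte encodings cannot reliably be told apart from the bytes alone. What can be checked is whether a file is already valid UTF-8, so those files are skipped and an interrupted run can simply be restarted. The function names are hypothetical.

```php
<?php
// Hypothetical helpers; the legacy encoding is an assumption supplied by
// the user, because it cannot be read from the bytes themselves.
function needs_conversion(string $bytes): bool
{
    // Valid UTF-8 (which includes plain ASCII) is left alone.
    return !mb_check_encoding($bytes, 'UTF-8');
}

function convert_file_to_utf8(string $path, string $legacyEncoding): bool
{
    $bytes = file_get_contents($path);
    if ($bytes === false) {
        throw new RuntimeException("Cannot read $path");
    }
    if (!needs_conversion($bytes)) {
        return false; // already converted on an earlier run
    }
    $converted = mb_convert_encoding($bytes, 'UTF-8', $legacyEncoding);
    // Write to a temporary file first, then rename over the original,
    // so a timeout does not leave a half-written entry behind.
    $tmp = $path . '.tmp';
    if (file_put_contents($tmp, $converted) === false || !rename($tmp, $path)) {
        throw new RuntimeException("Cannot write $path");
    }
    return true;
}
```

A caveat on this design: a legacy file whose bytes happen to form valid UTF-8 would be skipped. That is unlikely for ordinary accented text, but it is a heuristic, not a guarantee.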
bye