[Rabbit-proxy-users] Re: problems on Japanese multibyte environment

On Tue, 21 Jan 2003, Hiroki WAKABAYASHI wrote:

> Okay, here is an example page  http://biztech.nikkeibp.co.jp (tech info
> page from nikkei). The proxy "does filter" the page,  but the text
> filtered by proxy gets screwed up under iso-2022-jp .

Hmmm:
GET http://biztech.nikkeibp.co.jp/ HTTP/1.1
HTTP/1.1 200 OK
Server: Netscape-Enterprise/3.6 SP3
Date: Sun, 26 Jan 2003 11:03:26 GMT
Content-type: text/html
Connection: close
Age: 2
Via: HTTP/1.1 RabbIT

The server says that this is plain text/html, with no charset declared.
So treating the page as ISO-8859-1 (latin-1) seems OK from the proxy's
point of view.
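
To illustrate the point about the header, here is a minimal sketch (in Python, not RabbIT's actual Java code) of reading the charset from a Content-type value and falling back to latin-1 when none is given. The function name `charset_from_content_type` and the fallback default are assumptions made for this example only.

```python
from email.message import Message


def charset_from_content_type(value, default="iso-8859-1"):
    """Return the charset named in a Content-type header value.

    Falls back to `default` when the header carries no charset
    parameter, as with the nikkei response shown above.
    """
    msg = Message()
    msg["Content-Type"] = value
    # get_content_charset() returns the lowercased charset parameter,
    # or None when the header does not declare one.
    return msg.get_content_charset() or default


if __name__ == "__main__":
    # Header as sent by the server: no charset parameter.
    print(charset_from_content_type("text/html"))
    # Header as it would look if the server declared the encoding.
    print(charset_from_content_type("text/html; charset=ISO-2022-JP"))
```

With a header like the second one, a proxy would at least have a chance to pick the right decoder for the page.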

This also means that your extra filters will not handle this page.
One working solution is to add the DontFilterFilter and list this
page in it; the other solution is to educate nikkei to declare the
correct encoding in their responses.

Something like this:
[Filters]
httpinfilters=rabbit.filter.HTTPBaseFilter,rabbit.filter.DontFilterFilter

[rabbit.filter.DontFilterFilter]
dontFilterURLmatching=biztech.nikkeibp.co.jp

/robo