[Rabbit-proxy-users] Re: problems on Japanese multibyte environment
From: Robert O. <d9...@na...> - 2003-01-26 11:16:53
On Tue, 21 Jan 2003, Hiroki WAKABAYASHI wrote:

> Okay, here is an example page http://biztech.nikkeibp.co.jp (tech info
> page from nikkei). The proxy "does filter" the page, but the text
> filtered by proxy gets screwed up under iso-2022-jp.

Hmmm:

GET http://biztech.nikkeibp.co.jp/ HTTP/1.1

HTTP/1.1 200 OK
Server: Netscape-Enterprise/3.6 SP3
Date: Sun, 26 Jan 2003 11:03:26 GMT
Content-type: text/html
Connection: close
Age: 2
Via: HTTP/1.1 RabbIT

The server says that this is plain text/html, with no encoding declared.
So treating the page as ISO8859-1 (latin-1) seems ok from the proxy's
point of view. This also means that your extra filters will not handle
this page.

One working solution is to add the DontFilterFilter and list this page
in it; the other solution is to educate nikkei to use correct encodings
in their responses.

Something like this:

[Filters]
httpinfilters=rabbit.filter.HTTPBaseFilter,rabbit.filter.DontFilterFilter

[rabbit.filter.DontFilterFilter]
dontFilterURLmatching=biztech.nikkeibp.co.jp

/robo
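To illustrate the point about the missing encoding, here is a small Python sketch (not RabbIT code; the helper name charset_of and the sample text are made up for illustration). It assumes the HTTP/1.1 rule of the time (RFC 2616, section 3.7.1) that text/* content without a charset parameter defaults to ISO-8859-1, and shows what that does to bytes that are really iso-2022-jp:

```python
from email.message import Message

# RFC 2616 default for text/* media types sent without a charset parameter.
HTTP_DEFAULT_CHARSET = "iso-8859-1"


def charset_of(content_type: str) -> str:
    """Return the charset named in a Content-Type value, or the HTTP default.

    Hypothetical helper for illustration; it is not part of RabbIT.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset() or HTTP_DEFAULT_CHARSET


# Made-up sample text standing in for the Japanese page body.
sample = "日本語のページ"
wire_bytes = sample.encode("iso-2022-jp")

# What the server in this thread sent: no charset, so latin-1 is assumed.
declared = charset_of("text/html")
garbled = wire_bytes.decode(declared)

# What a correctly labelled response would allow.
correct = wire_bytes.decode(charset_of("text/html; charset=iso-2022-jp"))

print(declared)
print(garbled == sample)
print(correct == sample)
```

Decoding as latin-1 never fails, so a filter gets no error to react to; it simply sees the wrong characters, and any text it rewrites on that basis can come out damaged.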