Thread: [Filterproxy-devel] Re: FilterProxy
From: Bob M. <mce...@dr...> - 2002-03-04 14:57:10
|
Tomasz Rzad [to...@rz...] wrote:
> On Fri, Feb 15, 2002 at 11:11:13AM -0600, Bob McElrath wrote:
> > Tomasz Rzad [to...@rz...] wrote:
> > > Bob,
> > >
> > > Is there any way to use FilterProxy in Transparent Proxy Mode? I always get
> > > "400 URL must be absolute".
> >
> > I think I have reports that people have set it up this way before. If you're
> > getting that error, FilterProxy thinks it's in server mode. Are you requesting
> > a URL from FilterProxy itself? (i.e. http://host.here:8888/FilterProxy.html)
> > Does it work if you just request some other URL? (not a FilterProxy-served
> > config page)
> >
> > Can you tell me how it is set up so far? This would be useful info to add to
> > the README file.
>
> Do you have any idea how to run the setup I wrote you about before?
>
> Let me explain it to you again:
>
> I would like to redirect all requests to FilterProxy by doing
> "ipchains ... -j REDIRECT 8888" on my Linux box.
> Every time I do that I receive "URL must be absolute" (in my opinion this
> comes from UserAgent.pm).
> Can you help me with this matter?

I don't know how to do this. But I think I might know why it doesn't work.

Your browser thinks it is going to site www.abcd.com and sends a request like this:

GET /index.html

Your ipchains rule redirects this to FilterProxy, which sees the non-absolute
URL /index.html. FilterProxy can't figure out what site you were trying to
reach because there isn't enough information in the request line.

Can you turn on "dump headers to log file" on the Header config page, and
enable debug on the main page, and send me some of your logs?

Cheers,
-- Bob

Bob McElrath (rsm...@st...)
Univ. of Wisconsin at Madison, Department of Physics |
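[Editor's note: the failure described above can be sketched in a few lines. This is an illustrative Python sketch, not FilterProxy's actual Perl code; the function name `absolutize` is made up. It shows why a transparently redirected request line alone is not enough, and how an HTTP/1.1 `Host:` header could in principle supply the missing origin.]

```python
# Sketch: why "GET /index.html" is unrecoverable on its own, and how the
# Host: header (mandatory in HTTP/1.1) carries the missing origin.
# Hypothetical helper, not part of FilterProxy.

def absolutize(request_line: str, headers: dict) -> str:
    method, path, version = request_line.split()
    if path.startswith("http://"):
        return path  # normal proxy-style request: already absolute
    host = headers.get("Host")
    if host is None:
        # An HTTP/1.0 client may omit Host; the origin is then truly
        # unknowable -- this is the "400 URL must be absolute" situation.
        raise ValueError("URL must be absolute")
    return "http://" + host + path

# A browser talking to www.abcd.com, transparently redirected to the proxy:
print(absolutize("GET /index.html HTTP/1.1", {"Host": "www.abcd.com"}))
# -> http://www.abcd.com/index.html
```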
From: Bob M. <mce...@dr...> - 2003-04-23 18:22:00
|
dar santos [ju...@ho...] wrote:
> Yes. Got it running. It seems so weird that I'd rather not spend my time
> tracing the problem. I reinstalled woody and used a different http source
> to update apt-get, and there it was. Maybe the initial source I got had
> some un-updated packages. Well, anyway. Congrats. It's impressive and fast
> considering that it's based on perl. Its performance is comparable to any
> C-based application. Great work. Now I can start on what I had initially
> planned: porting it to win32.

I'm glad you like it. :)

> Anyway, I noticed the Imagecomp module (based on imagemagick convert, I
> presume). If I'm not mistaken, it is not active by default and the user
> has to enable it.

That module was contributed without a config page, and I have never really
used it, so it is kind of decaying... I wrote a config page for it though
and put it in CVS. It will be in the next release, when I get around to it...

> I don't know if you would be interested (correct me if I'm wrong), but the
> whole idea of http compression (text/html, xml, etc.), like the
> implementation of mod_gzip, would be less significant to those using
> dialup. Most modems have hardware compression, and there is also
> software-based compression (Stac, LZS, MPPC). Compressing already
> compressed content (gzip-encoded) would be useless, if not add overhead to
> the browsing process. I came across some data where, instead of gzip
> compression, html and other content is parsed or rewritten (I don't know
> if that is the right term). I tested it several times and it's really
> impressive: the size reduction is up to 40-50%, and the output is still
> plain html (not compressed). If you are interested, I would gladly look up
> that data again and send it.

On the contrary, the speedup over a modem is simply astounding.
I don't fully understand why, but it is visibly faster (by my measurements,
5 times faster or more). You have to have FilterProxy running on a
well-connected server so that it can feed compressed content over the modem.
Just try it. ;)

I think gzip is a more efficient algorithm for compressing text than any
used by a modem (typical compression ratios for HTML are 5x to 10x). Not
only that, but modems suffer from latency. It can take 300ms to fetch a
0-byte file from a server. Image-heavy pages are the pits over a modem. By
removing ads, FilterProxy typically reduces the number of connections your
browser needs to make to render the page, thereby speeding it up
significantly.

It is also possible to parse HTML and rewrite it to be smaller. The typical
HTML file contains a lot of whitespace, comments, etc. that can be removed
without changing the appearance of the page. However, parsing HTML is
extremely CPU intensive. In my tests it would take several seconds per page,
even on a modern CPU. There are other tools out there that do this (even
perl modules). If you are interested in pursuing this, I would definitely
accept such a module, but I think it would be slow.

The slowness of parsing HTML is why I chose a regex-based method to strip
ads. If I used a full HTML parser (like the perl module HTML::Parser) it
would be extremely slow.

> Thanks very much, and I'll write as soon as I can manage to make your
> program run on win32, or win64 that is.

Great! ;)

Cheers,
Bob McElrath [Univ. of Wisconsin at Madison, Department of Physics]

    "You measure democracy by the freedom it gives its dissidents,
    not the freedom it gives its assimilated conformists."
        -- Abbie Hoffman
|
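[Editor's note: the two size reductions discussed above -- gzip content encoding and whitespace/comment stripping -- can be demonstrated with a short sketch. Python is used here rather than FilterProxy's Perl; `shrink_html` is a hypothetical, deliberately naive minifier, and the sample page is made up.]

```python
import gzip
import re

def shrink_html(html: str) -> str:
    # Naive minifier: drop comments, collapse whitespace runs. Browsers
    # collapse whitespace anyway (outside <pre>), so ordinary pages render
    # the same. A production filter would need to special-case <pre>,
    # <script>, and conditional comments.
    html = re.sub(r"<!--.*?-->", "", html, flags=re.DOTALL)
    html = re.sub(r"\s+", " ", html)
    return html.strip()

# Made-up sample page: repetitive markup, indentation, and comments --
# exactly the redundancy both techniques exploit.
row = "  <!-- data row -->\n  <tr><td>row</td><td>data</td></tr>\n"
page = "<html><body>\n<table>\n" + row * 100 + "</table>\n</body></html>\n"

minified = shrink_html(page)
zipped = gzip.compress(page.encode())
print(f"original {len(page)} B, minified {len(minified)} B, "
      f"gzipped {len(zipped)} B")
```

On a real page the gzip ratio is smaller than on this artificially repetitive sample, but gzip still typically beats minification alone, since it removes redundancy inside tag names and text as well as in whitespace.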