Thread: Re: [Rabbit-proxy-development] Problems when filtering sites
From: Rick L. <ri...@le...> - 2006-09-25 18:46:57
On Fri, 2006-09-22 at 15:04 -0700, rab...@li... wrote:
> Rabbit will not filter pages that are already compressed. This is
> probably your problem, but since you do not give any example site it is
> hard to say.
> Adding unpacking+filtering+repacking to FilterHandler is easy, but
> it is not part of rabbit, at least not yet.

Is it not possible for Rabbit to tell the web server that it cannot
accept compressed HTML? Then Rabbit just does filtering+repacking.
That might be faster, if Rabbit has a fast connection and it is caching
the page. Robo, please correct me.

> I guess I have to figure out what direction I want rabbit to go,
> full filtering proxy or web accelerator proxy. For the first then I
> really ought to add that unpack+filtering to FilterHandler.
> If I take number 2 instead I am not sure that I want such features
> since they would go directly against rabbit's goal (introducing extra
> latency is not making surfing faster).

It is a big world on the internet, and I am all in favour of filtering
out the 'bad' bits. Let's not discuss which bits are bad right now, but
those of us who are parents will know what I mean. Maybe we should run
Dansguardian upstream of Rabbit. But it might be simpler to do the
filtering in Rabbit.

cheers -- Rick

From: Robert O. <ro...@kh...> - 2006-09-25 19:35:13
Rick Leir wrote:
> Is it not possible for Rabbit to tell the web server that it cannot
> accept compressed html? Then Rabbit just does filtering+repacking.
> That might be faster, if Rabbit has a fast connection and it is caching
> the page. Robo, please correct me.

That is easy to test and should hopefully work. Whether it ends up
faster or slower depends on the bandwidth and the latency to the real
server, so it is hard to say in advance.

Adding a NoZipFilter that checks all the Accept-Encoding headers and
removes gzip and compress values is probably almost trivial to write.
Maybe I will add that in a day or two...

Adding unzip+filtering+zipping is also easy. I will try to add that
later this week; rabbit is a spare time project so some development
is slow.

If any of you have patches available and care to share, then please do.

/robo

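The header rewriting that such a NoZipFilter would need can be sketched as below. This is only an illustration, not Rabbit's code: the class name AcceptEncodingStripper and the method strip are hypothetical, and the sketch works on a plain header string rather than on Rabbit's own header classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/**
 * Hypothetical helper (not part of Rabbit): removes the gzip and compress
 * codings from the value of one Accept-Encoding request header.
 */
final class AcceptEncodingStripper {

    /**
     * Returns the header value without gzip/x-gzip/compress/x-compress,
     * or null when nothing is left and the caller should drop the header.
     */
    static String strip(String headerValue) {
        List<String> kept = new ArrayList<>();
        for (String part : headerValue.split(",")) {
            String token = part.trim();
            if (token.isEmpty()) {
                continue;
            }
            // Ignore parameters such as ";q=0.8" when comparing the coding name.
            int semi = token.indexOf(';');
            String coding = (semi >= 0 ? token.substring(0, semi) : token)
                    .trim().toLowerCase(Locale.ROOT);
            boolean compressed = coding.equals("gzip") || coding.equals("x-gzip")
                    || coding.equals("compress") || coding.equals("x-compress");
            if (!compressed) {
                kept.add(token); // keep the original token, parameters included
            }
        }
        return kept.isEmpty() ? null : String.join(", ", kept);
    }

    public static void main(String[] args) {
        System.out.println(strip("gzip, deflate;q=0.5"));
    }
}
```

Dropping the header entirely when nothing remains is one possible choice; a server is then free to pick any coding, so sending "identity" explicitly may be the safer variant.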
From: Robert O. <ro...@kh...> - 2006-09-30 19:17:51
Robert Olofsson wrote:
> Rick Leir wrote:
>> Is it not possible for Rabbit to tell the web server that it cannot
>> accept compressed html? Then Rabbit just does filtering+repacking.
>> That might be faster, if Rabbit has a fast connection and it is caching
>> the page. Robo, please correct me.

I did write one such filter: NoGZipEncoding. I have only tested it
lightly, but it seems to work. At least it makes Google return
non-gzipped data.

That filter is not on by default, so remember to add it to
httpinfilters.

There is a first pre-release of 3.6 on the site, please help test it.

Have fun.
/robo

From: Fredric P. <fr...@sp...> - 2006-10-09 14:26:35
Hello list,

Does anyone know what I am doing wrong here? I've made my own filter.
As long as I put it alongside the other filters in the rabbit.filters
package and export my own JAR file, it works nicely. But when I try to
use my own class path, xx.xxxx.MyFilter, it won't load.

I added my JAR file (myFilter.jar) to the rabbit/jars folder.

I've added this to the config file:

    [rabbit.handler.FilterHandler]
    filters=xx.xxxx.MyFilter

    [xx.xxxx.MyFilter]
    jsUrl=http://xyz

...and I am starting RabbIT with:

    java -jar jars/rabbit3.jar -cp jars/myFilter.jar -f conf/rabbit.conf &

Regards,
Fredric Palmgren

From: Robert O. <ro...@kh...> - 2006-10-09 16:00:41
Fredric Palmgren wrote:
> ...and I am starting RabbIT with: java -jar jars/rabbit3.jar -cp
> jars/myFilter.jar -f conf/rabbit.conf &

Failure to read the man page for java. When you run "java -jar",
CLASSPATH and "-cp foo.jar" will be ignored.

You can:
1) Change the manifest and update the "Class-Path" entry
2) Start rabbit with something like:
       java -cp \
       jars/rabbit.jar:external_libs/dnsjava-2.0.1.jar:you/filters.jar \
       rabbit.proxy.ProxyStarter -f conf/rabbit.conf
3) Make rabbit use a class loader that automatically loads extra jars
   from RabbIT3/some_directory

I hope I will have time to do 3 some day, but it is a low priority
thing for me.

/robo

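Option 3 above could look roughly like the following sketch, which builds a URLClassLoader over every .jar file in one directory. Rabbit did not have this at the time of the message; the class ExtraJarLoader, its method names, and the idea of a single flat directory are all assumptions made for illustration.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of Rabbit): loads classes from extra jars in a directory. */
final class ExtraJarLoader {

    /** Collects URLs for all *.jar files directly inside dir; empty if dir is missing. */
    static URL[] jarUrls(File dir) {
        List<URL> urls = new ArrayList<>();
        File[] files = dir.listFiles((d, name) -> name.endsWith(".jar"));
        if (files != null) {
            Arrays.sort(files); // stable order, so lookups are predictable
            for (File f : files) {
                try {
                    urls.add(f.toURI().toURL());
                } catch (MalformedURLException e) {
                    throw new IllegalStateException("Bad jar path: " + f, e);
                }
            }
        }
        return urls.toArray(new URL[0]);
    }

    /** Class loader that checks the normal class path first, then the extra jars. */
    static ClassLoader forDirectory(File dir) {
        return new URLClassLoader(jarUrls(dir), ExtraJarLoader.class.getClassLoader());
    }

    public static void main(String[] args) {
        // The directory name is a placeholder; pass the real one as the first argument.
        File dir = new File(args.length > 0 ? args[0] : "extra_jars");
        System.out.println(Arrays.toString(jarUrls(dir)));
    }
}
```

A filter named in the config file would then be looked up with something like Class.forName(name, true, loader) instead of the default class loader.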
From: Fredric P. <fr...@sp...> - 2006-10-10 11:04:19
Hello list,

Well, I'm back. I corrected the blunder with the startup, and that part
seems to work nicely now.

While debugging my own filter, I reset everything from the start using
the 3.6 binary distro. It seems like I fill my error logs with the same
error over and over again.

At startup, I get a lot of:

    java.io.EOFException
        at java.io.DataInputStream.readFully(DataInputStream.java:178)
        at java.io.DataInputStream.readUTF(DataInputStream.java:565)
        at java.io.DataInputStream.readUTF(DataInputStream.java:522)
        at rabbit.http.HttpHeader.read(HttpHeader.java:258)
        at rabbit.proxy.HttpHeaderFileHandler.read(HttpHeaderFileHandler.java:18)
        at rabbit.proxy.HttpHeaderFileHandler.read(HttpHeaderFileHandler.java:14)
        at rabbit.cache.FileData.readData(FileData.java:28)
        at rabbit.cache.FiledKey.getData(FiledKey.java:54)
        at rabbit.cache.FiledKey.equals(FiledKey.java:41)
        at java.util.HashMap.eq(HashMap.java:277)
        at java.util.HashMap.put(HashMap.java:386)
        at rabbit.cache.NCache.readCacheIndex(NCache.java:478)
        at rabbit.cache.NCache.setCacheDir(NCache.java:134)
        at rabbit.cache.NCache.setup(NCache.java:602)
        at rabbit.cache.NCache.<init>(NCache.java:76)
        at rabbit.proxy.HttpProxy.setupCache(HttpProxy.java:214)
        at rabbit.proxy.HttpProxy.setConfig(HttpProxy.java:284)
        at rabbit.proxy.HttpProxy.setConfig(HttpProxy.java:152)
        at rabbit.proxy.ProxyStarter.startProxy(ProxyStarter.java:67)
        at rabbit.proxy.ProxyStarter.start(ProxyStarter.java:61)
        at rabbit.proxy.ProxyStarter.main(ProxyStarter.java:19)

And when running RabbIT, I get:

    java.lang.NullPointerException
        at rabbit.handler.BaseHandler.finishData(BaseHandler.java:172)
        at rabbit.handler.BaseHandler$ContentTransferListener.transferOk(BaseHandler.java:452)
        at rabbit.proxy.TransferHandler$3.run(TransferHandler.java:112)
        at rabbit.proxy.HttpProxy.runReturnedTasks(HttpProxy.java:487)
        at rabbit.proxy.HttpProxy.run(HttpProxy.java:398)
        at java.lang.Thread.run(Thread.java:595)

...and, yes, I am using all the standard jars and config files.

What am I missing here? Java says
"[WARN][convert -/usr/bin/convert- not found, is your path correct?]"
when I start up; perhaps this has something to do with it, but I'm not
using the image filters...

Regards,
Fredric Palmgren

From: Robert O. <ro...@kh...> - 2006-10-10 16:53:48
Fredric Palmgren wrote:
> It seems like I fill my error logs with the same error over and over again:
> At startup, I get a lot of:
> java.io.EOFException
...
> at rabbit.cache.FileData.readData(FileData.java:28)

The cache file format changed and you have an old cache that rabbit is
trying to use. Run "rm -rf /tmp/rcache" or similar to remove it, then
restart rabbit.

> java.lang.NullPointerException
> at rabbit.handler.BaseHandler.finishData(BaseHandler.java:172)

Not sure what this is, and my source tree has changed a bit so I am not
sure if the line numbers match. I will check.

/robo

From: Robert O. <ro...@kh...> - 2006-10-28 22:03:37
Hello!

Robert Olofsson wrote:
> Adding unzip+filtering+zipping is also easy, I will try to add that
> later this week, rabbit is a spare time project so some development
> is slow.

Ok, this took a bit longer than I thought and I am not finished yet.
When doing this I noticed some fundamental problems with the gzip
handling. In theory it was badly broken; in practice it works quite
well.

Anyway, I am almost done with a full rewrite of the gzip handling, and
in this I have also added the possibility to unpack + filter + pack
content.

I have changed lots of stuff in filtering and gzipping, so please tell
me what works and what does not work (but note that it is known to
break on some sites).

Apart from that I have also changed the config file to accept ids for
the handlers:

    image/gif=rabbit.handler.ImageHandler*gif
    ....
    [rabbit.handler.ImageHandler*gif]

So now it is possible to call image conversion depending on mime type
(or filtering html based on encoding or...).

There is a new 3.6 pre-release available. It seems to hang on quite a
few sites so do not use it for production yet.

Another thing to notice: rabbit on java/6-rc + linux/2.6.x does not
work well. The nio-based selector has been upgraded to a more scalable,
but still buggy, one:
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6481709
This will hopefully be resolved before java/6 is released.

Have fun!
/robo
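The unpack + filter + pack step discussed in this thread can be illustrated with the standard java.util.zip classes. This is a minimal sketch, not the rewritten Rabbit code: the class GzipRefilter is hypothetical, it buffers the whole body in memory (a proxy would more likely stream), and it assumes the page is UTF-8 text.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.function.UnaryOperator;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper (not part of Rabbit): gunzip a body, filter it, gzip it again. */
final class GzipRefilter {

    /** Compresses plain bytes into a gzip stream held in memory. */
    static byte[] gzip(byte[] plain) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(plain);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray(); // read after close, so the gzip trailer is included
    }

    /** Decompresses a complete gzip stream held in memory. */
    static byte[] gunzip(byte[] packed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(packed))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = gz.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** Unpacks the body, applies the text filter, and packs the result again. */
    static byte[] refilter(byte[] packed, UnaryOperator<String> filter) {
        String html = new String(gunzip(packed), StandardCharsets.UTF_8); // assumed charset
        return gzip(filter.apply(html).getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        byte[] packed = gzip("<p>hello <blink>world</blink></p>".getBytes(StandardCharsets.UTF_8));
        byte[] result = refilter(packed, s -> s.replace("<blink>", "").replace("</blink>", ""));
        System.out.println(new String(gunzip(result), StandardCharsets.UTF_8));
    }
}
```

The extra decompress and compress passes are where the added latency mentioned earlier in the thread comes from.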