rabbit-proxy-development Mailing List for RabbIT proxy (Page 34)
From: Michael V. <mi...@vo...> - 2006-04-16 00:32:20

Hello,

I was looking around for a proxy server implemented in Java, as the basis for an experimental project. RabbIT seems to be the most actively developed, as far as I could find at least -- so first of all, congratulations on a great project!

How are you all using it? As a "classic" proxy, like Squid etc.? I am interested in using it to build a proxy server that would run locally on workstations, mostly laptops that are frequently offline (network disconnected). How could we get RabbIT to cache all visited pages for this purpose, even if the server (or upstream proxy) says they cannot or should not be cached?

Based on that base functionality (which, to be honest, I was hoping to find, not implement), I am thinking about some extensions, such as "queuing" requests for not-yet-cached pages requested during offline operation, then "batch-prefetching" when online, and some more ideas in that direction... interested?

BTW: The classic http://www.gedanken.demon.co.uk/wwwoffle/ has such offline functionality, but I was hoping to find something in Java to extend more quickly for a POC... At http://www.proxy-offline-browser.com/ there is something like that too, but commercial, without sources. http://www.almaden.ibm.com/cs/wbi/ could probably also be used as a basis for stuff like this, but I haven't looked more closely yet, mostly because of the license. Maybe somebody here has seen other things in this direction?

Regards, Michael

PS: Some quick feedback on the built-in GUI that may be of interest: I at first kept trying all sorts of things with http://localhost:9666/ until I realized that it had to be http://MYMACHINENAME:9666 - maybe the doc could state this more clearly, or even better, instead of the error message that currently shows up when accessing it as localhost it could say (or even just redirect?!) to use the real hostname. Also, minor really, the LogRotator link on top gives "Couldnt find class:rabbit.meta.LogRotator, java.lang.ClassNotFoundException: rabbit.meta.LogRotator" and Config says "File 'config\index.html' not found." (All this was on RabbIT 3.0.)
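
Nothing in the thread suggests RabbIT ships the forced-caching behaviour Michael asks for; the header-rewriting half of the idea can, however, be sketched generically. A minimal, hypothetical sketch (class and names are illustrative, not RabbIT API): strip the response headers that forbid caching before they reach the cache logic.

    import java.util.*;

    /**
     * Hypothetical sketch (not part of RabbIT): make a response cacheable
     * by stripping the headers that normally forbid caching.
     */
    public class ForceCacheSketch {
        private static final Set<String> STRIP = new HashSet<>(Arrays.asList(
            "cache-control", "pragma", "expires"));

        /** Remove cache-defeating headers; headers are "Name: value" lines. */
        public static List<String> forceCacheable(List<String> headers) {
            List<String> out = new ArrayList<>();
            for (String h : headers) {
                int colon = h.indexOf(':');
                String name = colon < 0 ? h : h.substring(0, colon);
                if (!STRIP.contains(name.trim().toLowerCase(Locale.ROOT)))
                    out.add(h);
            }
            // Tell the cache the entry is fresh for a day (illustrative value).
            out.add("Cache-Control: max-age=86400");
            return out;
        }

        public static void main(String[] args) {
            List<String> hdrs = Arrays.asList(
                "Content-Type: text/html",
                "Cache-Control: no-store",
                "Pragma: no-cache");
            System.out.println(forceCacheable(hdrs));
        }
    }

The offline request-queuing part would still need real proxy plumbing; this only shows why the "cache everything" step itself is small.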
From: Robert O. <ro...@kh...> - 2006-04-14 13:48:14

Matej Mihelic wrote:
> I think the Rabbit performs incorrect handshake over SSL when combined
> with user authentification.

It seems to work for me. I get:
HTTP_CODE: 301
both if I specify --proxy-user or not.

Note: rabbit does not run any http filters on CONNECT requests (that is probably a bug). Rabbit does run the ip filters, though.

> I get the following line in access.log:
> 172.16.33.70 - - 04/apr/2006:12:39:11 GMT "CONNECT
> updates.mozilla.org:443 HTTP/1.0" 200 -

Seems normal. Note that the status code in rabbit's access_log is the status code for rabbit, not the status code from the real server. Rabbit handled this connection without problems, so it is a 200 Ok. Whether the resource had an http header with a status of 500, 404, 301 or 200 does not really matter to rabbit.

> And the following line in the error.log:
> [04/apr/2006:12:44:33 GMT][WARN][Tunnel: failed to handle:
> java.io.IOException: An existing connection was forcibly closed by the
> remote host]

Yes, the tunnel does not always understand nicely when the connection is closed, so sometimes it logs. This is not a problem.

> Without Rabbit3 in between I'll get the following HTTP CODE: 301

/robo
From: <mat...@ne...> - 2006-04-05 07:59:22

Samat Jain wrote:
> Matej Mihelic wrote:
> > * The proliferation of satellite and mobile (3G GSM - EDGE, UMTS)
> >   connections with very high latency (500ms+) requires a
> >   minimisation of requests sent from client to server.
>
> Just wondering, does not Firefox's proxy pipelining accomplish the same
> thing? That is, sending multiple requests and receiving multiple
> responses on the same connection, without having to modify the page?

That's the theory. It requires active (not just accepting) pipelining support in servers, proxy and clients. It also depends on the number of objects sent through the pipeline. You still have to fetch (from browser/proxy) most of the external objects, since before the page is parsed by the client (browser/proxy), only the server knows what objects are in it. You can test the examples that I have prepared and see if there are any results.

In a real-world scenario I think embedding would work better, since the proxy would prefetch and assemble most of the page before it is returned to the browser. In the proxy pipelining scenario you have to download the page to the browser; the browser parses it and then combines and sends the requests to the proxy and waits for the external objects. In the embedding scenario the browser sends a request to the proxy, the proxy fetches the page, parses it, fetches the external objects, embeds them in the page and returns the result to the browser.
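
For readers unfamiliar with the mechanics being debated here: pipelining just means writing several requests back-to-back before reading any response. It is easy to demonstrate with a raw socket; a minimal sketch (example.com is only a placeholder host, and real servers or proxies may refuse to pipeline):

    import java.io.*;
    import java.net.Socket;
    import java.nio.charset.StandardCharsets;

    /** Minimal HTTP/1.1 pipelining demo: two GETs written before any read. */
    public class PipelineDemo {
        public static void main(String[] args) throws IOException {
            try (Socket s = new Socket("example.com", 80)) {
                OutputStream out = s.getOutputStream();
                String req = "GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"
                           + "GET / HTTP/1.1\r\nHost: example.com\r\n"
                           + "Connection: close\r\n\r\n";
                out.write(req.getBytes(StandardCharsets.US_ASCII));
                out.flush();
                // Responses must arrive in request order; dump them raw.
                BufferedReader in = new BufferedReader(new InputStreamReader(
                    s.getInputStream(), StandardCharsets.US_ASCII));
                for (String line; (line = in.readLine()) != null; )
                    System.out.println(line);
            }
        }
    }

This also makes Matej's point concrete: the second request can only be pipelined once something (browser or proxy) already knows the URL, which is exactly what embedding avoids.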
From: Samat J. <li...@sa...> - 2006-04-05 07:28:47

Matej Mihelic wrote:
> * The proliferation of satellite and mobile (3G GSM - EDGE, UMTS)
>   connections with very high latency (500ms+) requires a
>   minimisation of requests sent from client to server.

Just wondering, does not Firefox's proxy pipelining accomplish the same thing? That is, sending multiple requests and receiving multiple responses on the same connection, without having to modify the page?

Samat

P.S. Sorry for destroying the threading, I just subscribed to this list and don't have access to the archive except through Sourceforge's web interface.

--
Samat Jain <http://www.samat.org/>
From: Samat J. <li...@sa...> - 2006-04-05 07:16:44

> > There is one more problem regarding Firefox and incomplete pages in
> > connection with time-outs. It looks like sometimes the connection will
> > simply hang and then an incomplete page is returned and cached by
> > Firefox. I don't exactly know what is happening but I've not seen such
> > behaviour with any other proxy, and I have a few experiences.
>
> Hmmm, I have not seen this. Do you have a web site where it usually
> happens?

I've had this happen several times on the site http://www.webhostingtalk.com/, particularly on the post listing pages and the posts themselves. I've yet to see a pattern why, though... The pages do not load particularly slowly, for example. I don't recall getting it from 2.0, and I am still using 3.0.

Samat

--
Samat Jain <http://www.samat.org>
From: Matej M. <rab...@ma...> - 2006-04-05 06:46:24

Robert Olofsson wrote:

RO> Matej Mihelic wrote:
MM> [MM] I'll try to find a pattern. It is probably connected with
MM> me overloading the Rabbit. My usual browsing habits include
MM> simultaneous opening of 30 tabs in Firefox :).
RO> I would not think so. Rabbit should handle many concurrent
RO> connections and unless you have changed your firefox config
RO> you are only using 4. Check "about:config" and
RO> network.http.max-persistent-connections-per-proxy, I have mine
RO> set to 8 at the moment. Also make sure that you have
RO> proxy.keep-alive and proxy pipelining set to true.
RO> Rabbit does not yet handle client side pipelining but I plan
RO> on handling that soon.

I have LOWERED my settings to match yours. I was pipelining up to 16 concurrent requests. In general I work on a 4Mbps+ uplink. I have also upgraded my Firefox 1.5.1 to the latest beta. I have a feeling that at least one of the problems is actually Firefox's. It is now working better. I have retested with 32 concurrent pipelined requests and it seems to work better as well. However, this is not a complete test and it could be a coincidence.

MM> [MM] That's what I meant. This one is very common due to people not
MM> thinking about what is written in the MS ASP documentation. There is
MM> an example specifying -1 for the expires property, and they use it
MM> everywhere.
RO> I'll check how rabbit handles them and see what we can do.

MM> RO> This could be easy or hard, depending on what you mean.
MM> RO> Blocking the CONNECT request is trivial, but what happens
MM> RO> on a tunneled and encrypted connection is not something
MM> RO> rabbit can filter.
MM>
MM> [MM] Yes. This would suffice. It would allow for filtering
MM> sites that are trying to enforce HTTPS.
RO> Ok, then it is simple. I'll see what I can do. But please note
RO> that my immediate action is only to put it in the TODO-file
RO> (spare time project!).
RO> /robo

Thanks. I'll install a JDK environment on my notebook. Perhaps I can find some spare time as well. Unfortunately, due to my lack of skills and knowledge of Java, this won't be very productive.

--
Regards, Matej.
From: Robert O. <ro...@kh...> - 2006-04-04 19:50:57

Matej Mihelic wrote:
> *** Log errors:
> [04/apr/2006:11:01:58 GMT][WARN][BaseHandler: error handling request:
> java.io.IOException: An established connection was aborted by the
> software in your host machine
>   at sun.nio.ch.SocketDispatcher.write0(Native Method)
>   at sun.nio.ch.SocketDispatcher.write(Unknown Source)
>   at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown Source)
>   at sun.nio.ch.IOUtil.write(Unknown Source)
>   at sun.nio.ch.SocketChannelImpl.write(Unknown Source)
>   at sun.nio.ch.FileChannelImpl.transferToTrustedChannel(Unknown Source)
>   at sun.nio.ch.FileChannelImpl.transferTo(Unknown Source)
>   at rabbit.proxy.FileResourceSource.transferTo(FileResourceSource.java:59)

I have seen this one as well; it does not seem to cause any problems. But until I know the exact cause of it, the warning will stay. One cause may be that you load only half a page and then move on, causing the browser to abort that download. There seem to be more causes for this, though.

> *** Console errors:
> java.util.ConcurrentModificationException
>   at java.util.HashMap$HashIterator.nextEntry(Unknown Source)
>   at java.util.HashMap$KeyIterator.next(Unknown Source)
>   at java.util.Collections$UnmodifiableCollection$1.next(Unknown Source)
>   at rabbit.proxy.HttpProxy.cancelTimeouts(HttpProxy.java:402)
>   at rabbit.proxy.HttpProxy.run(HttpProxy.java:383)
>   at java.lang.Thread.run(Unknown Source)

Ok, I have not seen this one. Good to hear about it. I currently have no idea what caused it, but at least rabbit should continue, possibly keeping one connection open until the next round, so it is not a big problem.

/robo
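
The ConcurrentModificationException above is the classic symptom of iterating a plain HashMap while it is being structurally modified, typically by another thread. A minimal standalone demonstration of the failure mode and one common repair (iterating a snapshot); this is illustrative only, not the actual HttpProxy.cancelTimeouts code:

    import java.util.*;

    /** Reproduces the HashMap iteration failure and a snapshot-based fix. */
    public class CmeSketch {
        public static void main(String[] args) {
            Map<String, Long> timeouts = new HashMap<>();
            timeouts.put("conn-1", 100L);
            timeouts.put("conn-2", 200L);

            // Failure mode: structural modification during iteration.
            try {
                for (String key : timeouts.keySet())
                    timeouts.remove(key);   // throws ConcurrentModificationException
            } catch (ConcurrentModificationException e) {
                System.out.println("caught: " + e);
            }

            // Repair: iterate a snapshot, mutate the real map freely.
            for (String key : new ArrayList<>(timeouts.keySet()))
                timeouts.remove(key);
            System.out.println("map now: " + timeouts);
        }
    }

When the modifier really is another thread, a java.util.concurrent.ConcurrentHashMap (whose iterators are weakly consistent and never throw this) is the more robust fix.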
From: Robert O. <ro...@kh...> - 2006-04-04 19:40:10

Matej Mihelic wrote:
> [MM] I'll try to find a pattern. It is probably connected with me
> overloading the Rabbit. My usual browsing habits include simultaneous
> opening of 30 tabs in Firefox :).

I would not think so. Rabbit should handle many concurrent connections, and unless you have changed your firefox config you are only using 4. Check "about:config" and network.http.max-persistent-connections-per-proxy; I have mine set to 8 at the moment. Also make sure that you have proxy.keep-alive and proxy pipelining set to true. Rabbit does not yet handle client side pipelining, but I plan on handling that soon.

> [MM] That's what I meant. This one is very common due to people not
> thinking about what is written in the MS ASP documentation. There is an
> example specifying -1 for the expires property. And they are using it
> everywhere.

I'll check how rabbit handles them and see what we can do.

> > This could be easy or hard, depending on what you mean.
> > Blocking the CONNECT request is trivial, but what happens on a tunneled
> > and encrypted connection is not something rabbit can filter.
>
> [MM] Yes. This would suffice. It would allow for filtering sites that
> are trying to enforce HTTPS.

Ok, then it is simple. I'll see what I can do. But please note that my immediate action is only to put it in the TODO-file (spare time project!).

/robo
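
A lenient treatment of the non-conforming Expires values discussed in this thread is straightforward: anything unparsable, including the common "-1", is treated as already expired rather than rejected. A sketch of that policy (standalone, not RabbIT's cache code):

    import java.text.ParseException;
    import java.text.SimpleDateFormat;
    import java.util.Date;
    import java.util.Locale;
    import java.util.TimeZone;

    /** Lenient Expires parsing: bad values mean "stale", never "cache forever". */
    public class ExpiresSketch {
        private static final SimpleDateFormat RFC1123 =
            new SimpleDateFormat("EEE, dd MMM yyyy HH:mm:ss zzz", Locale.US);
        static { RFC1123.setTimeZone(TimeZone.getTimeZone("GMT")); }

        /** Returns the expiry time, or the epoch (already expired) for junk like "-1". */
        public static synchronized Date parseExpires(String value) {
            if (value == null) return new Date(0);
            try {
                return RFC1123.parse(value.trim());
            } catch (ParseException e) {
                return new Date(0);   // "-1", "0", garbage: treat as already expired
            }
        }

        public static void main(String[] args) {
            System.out.println(parseExpires("Thu, 01 Dec 2994 16:00:00 GMT"));
            System.out.println(parseExpires("-1"));
        }
    }

Treating junk as "already expired" is the safe direction: the worst case is an unnecessary revalidation, never serving a stale page as fresh.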
From: Matej M. <rab...@ma...> - 2006-04-04 12:48:51

* 060404: BUG REPORT - Rabbit pre-3.1 error messages - incorrect handling of user authorisation for SSL connections

I think Rabbit performs an incorrect handshake over SSL when combined with user authentication. An example:

curl -o t.txt -w "HTTP_CODE: %{http_code}" -k --proxy proxy:port --proxy-user "user:pass" https://updates.mozilla.org
HTTP_CODE: 000curl: (35) error:140770FC:SSL routines:SSL23_GET_SERVER_HELLO:unknown protocol

I get the following line in access.log:

172.16.33.70 - - 04/apr/2006:12:39:11 GMT "CONNECT updates.mozilla.org:443 HTTP/1.0" 200 -

And the following line in the error.log:

[04/apr/2006:12:44:33 GMT][WARN][Tunnel: failed to handle: java.io.IOException: An existing connection was forcibly closed by the remote host]

Without Rabbit3 in between I get the following: HTTP_CODE: 301

To file a complete report: if I open the same page over the HTTP address I get no message in the error.log, and the following line in the access.log:

172.16.33.70 - user 04/apr/2006:12:43:59 GMT "GET http://updates.mozilla.org HTTP/1.1" 200 -
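
For context on what curl's "unknown protocol" usually means here: a proxy's side of the CONNECT exchange is tiny - read the request line and headers, reply with a bare 200 status line, then copy bytes both ways untouched. If the proxy emits anything else before the client starts its TLS handshake (an auth challenge body, an error page), OpenSSL sees non-TLS bytes and fails exactly like this. A bare-bones sketch of the expected shape, with auth checking and error handling deliberately omitted (not RabbIT's tunnel code):

    import java.io.*;
    import java.net.*;
    import java.nio.charset.StandardCharsets;

    /** Skeleton of a CONNECT tunnel: reply 200, then copy bytes verbatim. */
    public class ConnectSketch {
        public static void main(String[] args) throws IOException {
            try (ServerSocket server = new ServerSocket(9667)) {   // demo port
                Socket client = server.accept();
                BufferedReader in = new BufferedReader(new InputStreamReader(
                    client.getInputStream(), StandardCharsets.US_ASCII));
                String requestLine = in.readLine();   // "CONNECT host:443 HTTP/1.0"
                for (String h = in.readLine(); h != null && !h.isEmpty(); h = in.readLine())
                    ;                                  // skip headers (Proxy-Authorization lives here)
                String[] hostPort = requestLine.split("\\s+")[1].split(":");
                Socket upstream = new Socket(hostPort[0], Integer.parseInt(hostPort[1]));
                // Anything other than this bare status line breaks the TLS handshake.
                client.getOutputStream().write(
                    "HTTP/1.0 200 Connection established\r\n\r\n"
                        .getBytes(StandardCharsets.US_ASCII));
                pump(client.getInputStream(), upstream.getOutputStream());  // client -> server
                pump(upstream.getInputStream(), client.getOutputStream());  // server -> client
            }
        }

        private static void pump(InputStream in, OutputStream out) {
            new Thread(() -> {
                try {
                    byte[] buf = new byte[8192];
                    for (int n; (n = in.read(buf)) != -1; ) { out.write(buf, 0, n); out.flush(); }
                } catch (IOException ignored) { /* tunnel closed */ }
            }).start();
        }
    }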
From: Matej M. <rab...@ma...> - 2006-04-04 11:59:38

* 060404: Rabbit pre-3.1 error messages

I am getting the following two messages that I cannot diagnose. One appears in the log files and the other one on the console.

*** Log errors:

[04/apr/2006:11:01:58 GMT][WARN][BaseHandler: error handling request:
java.io.IOException: An established connection was aborted by the software in your host machine
  at sun.nio.ch.SocketDispatcher.write0(Native Method)
  at sun.nio.ch.SocketDispatcher.write(Unknown Source)
  at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown Source)
  at sun.nio.ch.IOUtil.write(Unknown Source)
  at sun.nio.ch.SocketChannelImpl.write(Unknown Source)
  at sun.nio.ch.FileChannelImpl.transferToTrustedChannel(Unknown Source)
  at sun.nio.ch.FileChannelImpl.transferTo(Unknown Source)
  at rabbit.proxy.FileResourceSource.transferTo(FileResourceSource.java:59)
  at rabbit.proxy.TransferHandler.run(TransferHandler.java:42)
  at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
  at java.lang.Thread.run(Unknown Source)
]

*** Console errors:

java.util.ConcurrentModificationException
  at java.util.HashMap$HashIterator.nextEntry(Unknown Source)
  at java.util.HashMap$KeyIterator.next(Unknown Source)
  at java.util.Collections$UnmodifiableCollection$1.next(Unknown Source)
  at rabbit.proxy.HttpProxy.cancelTimeouts(HttpProxy.java:402)
  at rabbit.proxy.HttpProxy.run(HttpProxy.java:383)
  at java.lang.Thread.run(Unknown Source)
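
The first trace comes from RabbIT's zero-copy path: FileChannel.transferTo hands the cached file to the socket without copying it through user space, and a client that aborts mid-download surfaces there as an IOException. A sketch of that pattern with a hypothetical file path and no real proxy plumbing:

    import java.io.FileInputStream;
    import java.io.IOException;
    import java.net.InetSocketAddress;
    import java.nio.channels.FileChannel;
    import java.nio.channels.SocketChannel;

    /** Zero-copy send of a cached file to a client socket. */
    public class TransferToSketch {
        public static void main(String[] args) throws IOException {
            try (FileChannel file = new FileInputStream("/tmp/cached-entry").getChannel();
                 SocketChannel sock = SocketChannel.open(
                     new InetSocketAddress("localhost", 9666))) {
                long pos = 0, size = file.size();
                while (pos < size) {
                    try {
                        pos += file.transferTo(pos, size - pos, sock);
                    } catch (IOException e) {
                        // A client that gives up mid-page aborts the socket; the
                        // kernel reports it here - the WARN Matej is seeing.
                        System.err.println("client aborted transfer: " + e.getMessage());
                        break;
                    }
                }
            }
        }
    }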
From: Matej M. <rab...@ma...> - 2006-04-04 07:57:10

> -----Original Message-----
> From: Robert Olofsson [mailto:ro...@kh...]
> Sent: 03. april 2006 22:34
> To: Matej Mihelič
> Cc: rab...@li...
> Subject: Re: [Rabbit-proxy-development] 060403: Pre-3.1 impressions &
> ideas for the future releases

[MM] ...

> > There is one more problem regarding Firefox and incomplete pages in
> > connection with time-outs. It looks like sometimes the connection will
> > simply hang and then an incomplete page is returned and cached by
> > Firefox. I don't exactly know what is happening but I've not seen such
> > behaviour with any other proxy and I have a few experiences.
>
> Hmmm, I have not seen this. Do you have a web site where it usually
> happens?

[MM] I'll try to find a pattern. It is probably connected with me overloading the Rabbit. My usual browsing habits include simultaneous opening of 30 tabs in Firefox :).

> > There is a condition that you may want to handle gracefully. Some
> > webmasters use and send "-1" as a value for the "Expires" header. I know
> > this is a violation of the RFC but it is common. You should handle it
> > gracefully. It means don't cache, or already expired.
>
> Maybe, but note that there are _very_ many different expires that
> violate the spec. I could try to make rabbit be a bit nicer with -1 and
> already expired entries when it is run in non-strict mode.

[MM] That's what I meant. This one is very common due to people not thinking about what is written in the MS ASP documentation. There is an example specifying -1 for the expires property, and they use it everywhere.

> > If you would implement the ad blocking and URL blocking in a way that
> > would enable users to use pre-prepared block lists you would really
> > help them. The most popular ad-block list would be the one for the
> > ADBlock Firefox extension - G-Filter sets (
> > http://www.pierceive.com/filtersetg/ ). They are formed in two separate
> > lists: black list and white list. By implementing the import from them
> > in the Rabbit you would really help the users.
>
> This ought to be simple enough. Rabbit currently uses one regexp and
> these lists seem to be a set of regexps.

[MM] I am glad to hear this. This really looks very useful to me.

> > * Enhance "rabbit.filter.BlockFilter" to block HTTPS URLs as well.
> >   I have explained in a previous message why I find this important.
>
> This could be easy or hard, depending on what you mean.
> Blocking the CONNECT request is trivial, but what happens on a tunneled
> and encrypted connection is not something rabbit can filter.

[MM] Yes. This would suffice. It would allow for filtering sites that are trying to enforce HTTPS.

> > * When possible embed external files into the HTML using the
> >   RFC-2397 data URI scheme
>
> Ugh, sounds like this will take some time, if I am to do it. You do not
> have a patch ready? ;-).

[MM] I get the back tone :) Your project is the first one in a long time that is interesting to me. But I am no Java coder - actually I am no programmer at all. If there were something to do with a DB/SQL backend then it would be easy for me to help. A more important reason for me not immediately offering help is that I have decided to steer as far away from computers in my spare time as I possibly can. I am not very successful in this resolution, but I try.

> > I know that data URI is in general limited to 1024-4096 bytes
>
> Rabbit currently has a 4k buffer so that is the current maximum uri for
> rabbit. I plan to make that growable as needed in the future. I believe
> that it ought to stay limited though.

> > * I would also replace HREFs that are pointing to
> >   (adfilter/blockfilter) blocked URLs with one of the following ones:
> >   o HREF to a fixed error page at the RABBIT server. This
> >     would allow for caching of the response.
> >   o HREF=data:,Rabbit%20denied%20this%20page
>
> What do you think that this does:
> adreplacer=http://$proxy/FileSender/public/NoAd.gif

[MM] Ah... how easy it is to get carried away and not think things through twice. Yes, you are right.

> So for ad-filtering this already works. Adding blocked site replacement
> could be tricky, due to me wanting to have the real site name in the
> "this page '<page url>' is blocked by rabbit configuration..."

[MM] Well, you could embed the message in a data URI. But this is a function of ad-blocking - isn't it? Perhaps a variable with the blocked URL would be useful? It would allow for the following syntax:

adreplacer=data:text/html,<HTML><BODY><B>Rabbit3</B> denied this page - <A href="$1noproxy.$2">Click here for unfiltered page</A></BODY></HTML>

> > Then there is one last proposal. You could implement SSL filtering as
> > well. Proxomitron is a great example of how it could be done. It uses
> > a temporary SSL key between client and proxy and temporary or predefined
> > SSL certificates when communicating with remote servers.
>
> Maybe, again, this is probably something that will take time. Rabbit is
> a spare time project so patches are very welcome.
>
> Many nice ideas, I like it.

[MM] I must say that I am tempted. We will see.

> Thanks
> /robo
From: Robert O. <ro...@kh...> - 2006-04-03 20:34:40

Matej Mihelič wrote:
> I have tested the pre-3.1 version and it did fix most of my problems. It
> fixed all the reported ones.

Glad to hear that.

> There is one more problem regarding Firefox and incomplete pages in
> connection with time-outs. It looks like sometimes the connection will
> simply hang and then an incomplete page is returned and cached by
> Firefox. I don't exactly know what is happening but I've not seen such
> behaviour with any other proxy and I have a few experiences.

Hmmm, I have not seen this. Do you have a web site where it usually happens?

> There is a condition that you may want to handle gracefully. Some
> webmasters use and send "-1" as a value for the "Expires" header. I know
> this is a violation of the RFC but it is common. You should handle it
> gracefully. It means don't cache, or already expired.

Maybe, but note that there are _very_ many different expires that violate the spec. I could try to make rabbit be a bit nicer with -1 and already expired entries when it is run in non-strict mode.

> If you would implement the ad blocking and URL blocking in a way that
> would enable users to use pre-prepared block lists you would really
> help them. The most popular ad-block list would be the one for the
> ADBlock Firefox extension - G-Filter sets (
> http://www.pierceive.com/filtersetg/ ). They are formed in two separate
> lists: black list and white list. By implementing the import from them
> in the Rabbit you would really help the users.

This ought to be simple enough. Rabbit currently uses one regexp and these lists seem to be a set of regexps.

> * Enhance "rabbit.filter.BlockFilter" to block HTTPS URLs as well.
>   I have explained in a previous message why I find this important.

This could be easy or hard, depending on what you mean. Blocking the CONNECT request is trivial, but what happens on a tunneled and encrypted connection is not something rabbit can filter.

> * When possible embed external files into the HTML using the
>   RFC-2397 data URI scheme

Ugh, sounds like this will take some time, if I am to do it. You do not have a patch ready? ;-).

> I know that data URI is in general limited to 1024-4096 bytes

Rabbit currently has a 4k buffer so that is the current maximum uri for rabbit. I plan to make that growable as needed in the future. I believe that it ought to stay limited though.

> * I would also replace HREFs that are pointing to
>   (adfilter/blockfilter) blocked URLs with one of the following ones:
>   o HREF to a fixed error page at the RABBIT server. This
>     would allow for caching of the response.
>   o HREF=data:,Rabbit%20denied%20this%20page

What do you think that this does:
adreplacer=http://$proxy/FileSender/public/NoAd.gif

So for ad-filtering this already works. Adding blocked site replacement could be tricky, due to me wanting to have the real site name in the "this page '<page url>' is blocked by rabbit configuration..."

> Then there is one last proposal. You could implement SSL filtering as
> well. Proxomitron is a great example of how it could be done. It uses
> a temporary SSL key between client and proxy and temporary or predefined
> SSL certificates when communicating with remote servers.

Maybe, again, this is probably something that will take time. Rabbit is a spare time project so patches are very welcome.

Many nice ideas, I like it.

Thanks
/robo
From: <mat...@ne...> - 2006-04-03 14:07:22

* 060403: Pre-3.1 impressions & ideas for the future releases

I have tested the pre-3.1 version and it did fix most of my problems. It fixed all the reported ones.

There is one more problem regarding Firefox and incomplete pages in connection with time-outs. It looks like sometimes the connection will simply hang and then an incomplete page is returned and cached by Firefox. I don't exactly know what is happening, but I've not seen such behaviour with any other proxy, and I have a few experiences.

There is a condition that you may want to handle gracefully. Some webmasters use and send "-1" as a value for the "Expires" header. I know this is a violation of the RFC, but it is common. You should handle it gracefully. It means don't cache, or already expired.

As for my suggestions for the future, I would like to suggest enhancements that would improve the user experience in two areas:

  * Ads are real bandwidth hogs and are annoying.
  * The proliferation of satellite and mobile (3G GSM - EDGE, UMTS) connections with very high latency (500ms+) requires a minimisation of requests sent from client to server.

If you would implement the ad blocking and URL blocking in a way that would enable users to use pre-prepared block lists, you would really help them. The most popular ad-block list would be the one for the ADBlock Firefox extension - G-Filter sets ( http://www.pierceive.com/filtersetg/ ). They are formed in two separate lists: a black list and a white list. By implementing the import from them in Rabbit you would really help the users.

My suggestions for new ad-blocking features would be the following:

  * Implement white-list and black-list definitions for "rabbit.filter.BlockFilter" as well as for "rabbit.filter.AdFilter" ("blockURLmatching" and "DontBlockURLmatching"). This would allow for more relaxed filtering with optional white-listing of certain sites.
  * Besides the current format for filters, allow reading patterns from a file in the G-Filter lists format ("blockURLmatchingFile" and "DontBlockURLmatchingFile"). You should probably convert the patterns to a common format and merge them with the ones from "blockURLmatching"/"DontBlockURLmatching".
  * Enhance "rabbit.filter.BlockFilter" to block HTTPS URLs as well. I have explained in a previous message why I find this important.

My suggestions for high-latency link acceleration are the following:

  * When possible, embed external files into the HTML using the RFC-2397 data URI scheme (IMG tag, SCRIPT tag, STYLE tag - you fetch the file from SRC/HREF and replace it). References: http://en.wikipedia.org/wiki/Data:_URL, http://www.mozilla.org/quality/networking/docs/aboutdata.html. I know that this is currently only supported by the Mozilla and Opera browsers, but it would probably help tremendously on a high-latency link. There is a way to get partial RFC-2397 support in IE through a protocol handler, but it will be limited by the URL connection limit in IE. I've put a copy of the IE plugin on my server: http://neosys.si/users/Matej/DataProtocol.zip. Examples:

    - http://neosys.si/users/matej/rabbit/Data_SiOL.net.htm (Opera and Netscape)
    - http://neosys.si/users/matej/rabbit/Data_IE_SiOL.net.htm (IE - doesn't support GIF data URIs?!?)
    - http://www.scalora.org/projects/uriencoder/ (original: http://neosys.si/users/matej/rabbit/SiOL.net.htm )

    I know that data URIs are in general limited to 1024-4096 bytes (Mozilla: unlimited) and that this would actually increase the file size and disable the caching effect. This is against the current goals, but I see the following arguments:

    o High-latency links in general have a high throughput.
    o Due to the reduced size from JPEG re-compression the files could still be smaller.
    o The limitation of 4096 bytes would - due to the JPEG file reduction - suffice for most sites. And with Firefox this limitation does not exist.
    o Caching is not so important for pages where you don't browse around the site and there are many new images anyway (news sites).

    I would suggest the following configuration variables:

    - enableDataURIforTags=IMG|STYLE|SCRIPT
    - enableDataURIforObjectsWithExtension=JPG|JPEG|CSS|GIF|JS
    - maximumSizeForDataURI=16384 ; Firefox can take it.
    - dontEmbeddDataURIforSites=

  * I would also replace HREFs that are pointing to (adfilter/blockfilter) blocked URLs with one of the following:

    o an HREF to a fixed error page at the RabbIT server. This would allow for caching of the response.
    o HREF=data:,Rabbit%20denied%20this%20page

    This would remove the need for a round trip to the server for the 403 message. Unfortunately it would mask the destination URL, but since the user can request the unfiltered page he can still find it. I know that there would be another option - multipart encoding - but I have no idea how well that is supported across browsers.

Then there is one last proposal. You could implement SSL filtering as well. Proxomitron is a great example of how it could be done. It uses a temporary SSL key between client and proxy and temporary or predefined SSL certificates when communicating with remote servers.

--
Best regards, Matej.
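
The embedding step Matej proposes is mechanically simple; the hard part is the HTML rewriting around it. A minimal sketch of turning a fetched object into an RFC 2397 data URI, using modern Java's Base64 (the file name and size cap are illustrative, the cap mirroring the proposed maximumSizeForDataURI):

    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.util.Base64;

    /** Turn a small fetched resource into an RFC 2397 data URI for inlining. */
    public class DataUriSketch {
        static final int MAX_EMBED = 16384;   // illustrative size cap

        public static String toDataUri(byte[] body, String mimeType) {
            if (body.length > MAX_EMBED)
                throw new IllegalArgumentException("too big to embed: " + body.length);
            return "data:" + mimeType + ";base64,"
                 + Base64.getEncoder().encodeToString(body);
        }

        public static void main(String[] args) throws Exception {
            Path img = Paths.get("logo.gif");   // hypothetical cached object
            String uri = toDataUri(Files.readAllBytes(img), "image/gif");
            // The rewriter would now replace <img src="logo.gif"> with:
            System.out.println("<img src=\"" + uri + "\">");
        }
    }

Note that base64 inflates the payload by about a third, which is part of the size trade-off discussed above.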
From: Rick L. <rl...@le...> - 2006-03-31 19:53:14

On Tue, 28 Mar 2006 rab...@li... wrote:
> I have just released version 3.0 of rabbit, the web proxy.

Thanks for this, Robo.

cheers -- Rick
From: Robert O. <ro...@kh...> - 2006-03-30 20:45:50

Matej Mihelič wrote:
> * Rabbit 3.0 - Bug report - rabbit.handler.ImageHandler parameters are
> no longer honoured

Ok, that is a bug. Rabbit/3 dropped the config. Fixed in my tree. Check the next 3.1 pre-release.

> Please, let me know if you prefer receiving bug reports via the
> SourceForge bug tracking system?

Mail is better for me. sf.net is usually very slow for me, so it is a pain to use.

Thanks
/robo
From: <Mat...@ne...> - 2006-03-30 20:41:58

> -----Original Message-----
> From: Robert Olofsson [mailto:ro...@kh...]
> Sent: 30. marec 2006 22:24
> To: Matej Mihelič
> Cc: rab...@li...
> Subject: Re: [Rabbit-proxy-development] Rabbit 3.0 - Bug report - Uplink
> proxy support is not working & ADFilter non-expected behaviour & a bit
> too generic default blockURLmatching

[...]

> > - A bit too generic default blockURLmatching
>
> Could be. I am not sure if the block filter ought to be enabled by
> default. In rabbit/2 it was off by default. For me the current setting
> has worked very well. Can you tell me what sites are blocked when
> they should not be?

[MM] Like I wrote, it is blocking the largest Slovenian search engine - http://najdi.si . They use a URL that matches the default pattern as a redirect link to the original site. Any link in the search results that you try to open will return an access denied message by default in Rabbit3. It may be an isolated example, but it took me just a few minutes to find it.

> Adjusting the block filter and leaving it off by default may be a better
> default perhaps? Not sure, it will be on for now. If someone else has an
> opinion I would like to hear it.

[MM] I think that this is a possible solution. The other one would be to include a more visible warning in the configuration file explaining what this setting does or can cause. Even better would be to include this information in the 403 page that Rabbit returns to the user as well.

> Thanks
> /robo
From: Robert O. <ro...@kh...> - 2006-03-30 20:23:46

Matej Mihelič wrote:
> - Uplink proxy support is not working

True. Fixed in source now. I'll upload a 3.1-pre later today.

> - ADFilter non-expected behaviour

Ah, yes, that is probably a bug. I have changed the default pattern to be "[/.]ad[/.]" instead of the empty string. That ought to be a quite safe default.

> - A bit too generic default blockURLmatching

Could be. I am not sure if the block filter ought to be enabled by default. In rabbit/2 it was off by default. For me the current setting has worked very well. Can you tell me what sites are blocked when they should not be?

Adjusting the block filter and leaving it off by default may be a better default, perhaps? Not sure, it will be on for now. If someone else has an opinion I would like to hear it.

Thanks
/robo
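
The two patterns under discussion behave very differently: Java's Matcher.find returns true for every input when the pattern is the empty string, which is exactly why a missing adlinks setting made "almost everything match as an ad". A quick standalone demonstration (not the AdFilter source):

    import java.util.regex.Pattern;

    /** Empty pattern matches everything; "[/.]ad[/.]" only matches "ad"
        as a whole path or host component between '/' or '.' separators. */
    public class AdPatternDemo {
        public static void main(String[] args) {
            String[] urls = {
                "http://example.com/index.html",
                "http://ads.example.com/banner.gif",
                "http://example.com/ad/top.gif",
            };
            Pattern empty = Pattern.compile("");          // matches at offset 0 of any string
            Pattern safer = Pattern.compile("[/.]ad[/.]");
            for (String u : urls)
                System.out.printf("%-40s empty=%b safer=%b%n",
                    u, empty.matcher(u).find(), safer.matcher(u).find());
        }
    }

Running this shows the safer default blocks /ad/top.gif but leaves both the plain page and even ads.example.com alone - conservative, which is the point of a default.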
From: <Mat...@ne...> - 2006-03-30 16:07:40

* Rabbit 3.0 - Bug report - rabbit.handler.ImageHandler parameters are no longer honoured

I am unable to use "convert" located anywhere else but in the /usr/bin folder. I am using the same configuration for "rabbit.handler.ImageHandler" and the executable location as I did with the Rabbit3 beta.

...
[rabbit.handler.ImageHandler]
convert=c:/Program Files/ImageMagick-6.2.6-Q16/convert.exe
convertargs=-quality 10 -flatten $filename jpeg:$filename.c
...

Please, let me know if you prefer receiving bug reports via the SourceForge bug tracking system? I am posting this to the mailing list because the bug list is virtually empty.

--
Best regards, Matej Mihelic
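
For reference, the pattern behind such a handler is an external-command invocation with placeholder substitution. A sketch of how a convert/convertargs pair might be wired up - hypothetical code, not RabbIT's ImageHandler source:

    import java.io.File;
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;

    /** Run an external image converter with $filename substituted into its args. */
    public class ConvertRunner {
        public static int recompress(String convertPath, String convertArgs,
                String filename) throws IOException, InterruptedException {
            List<String> cmd = new ArrayList<>();
            cmd.add(convertPath);   // full path from the config, NOT a hardcoded /usr/bin
            for (String arg : convertArgs.split(" "))
                cmd.add(arg.replace("$filename", filename));
            Process p = new ProcessBuilder(cmd).redirectErrorStream(true).start();
            return p.waitFor();     // nonzero exit means the conversion failed
        }

        public static void main(String[] args) throws Exception {
            int rc = recompress(
                "c:/Program Files/ImageMagick-6.2.6-Q16/convert.exe",
                "-quality 10 -flatten $filename jpeg:$filename.c",
                new File("cacheentry.img").getAbsolutePath());
            System.out.println("convert exited with " + rc);
        }
    }

Passing the executable path as a single list element is what makes a path with spaces ("Program Files") work; building one command string and splitting it on spaces is the usual way such a bug creeps in.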
From: <Mat...@ne...> - 2006-03-30 08:02:45

* Rabbit 3.0 - Bug report - Uplink proxy support is not working & ADFilter non-expected behaviour

- Uplink proxy support is not working

Rabbit seems to ignore the proxy settings. I have set the proxyhost, proxyport and proxyauth variables in the [rabbit.proxy.HttpProxy] section, but they are ignored.

- ADFilter non-expected behaviour

If you enable rabbit.filter.AdFilter and leave the configuration section [rabbit.filter.AdFilter] without an "adlinks" entry, then almost everything matches as an ad. By "without" I don't mean an empty setting ("adlinks=") but no setting at all.

- A bit too generic default blockURLmatching

Please, this is just my opinion. Since this is the default for a new installation, the list should be a bit less relaxed. For a non-technical user it might be hard to detect where the problem lies if you match too many URLs. For instance, here in Slovenia the most-used local search engine is rendered useless, since part of its URLs matches the default patterns.

But like I said, this is just my opinion. What I propose is that you write/create a bit more narrowly defined default set - for instance one that matches just images - or that it is not enabled by default.

--
Best regards, Matej Mihelic
From: Robert O. <ro...@kh...> - 2006-03-28 20:19:01

Hello!

I have just released version 3.0 of rabbit, the web proxy.

This is a major upgrade of rabbit; it now uses java.nio for non-blocking io. This means that rabbit will only use a few threads, even if you have many users. Rabbit also uses zero copy transfers when possible, so system load ought to be lower. Some of the filters have been upgraded to be regexp based instead of comma-list based.

The pre-release has been stable enough for my needs for a few weeks now. But since this is a .0 release it will probably have a few bugs. Try it and tell me, please.

The code base is much cleaner and hopefully better. It is now possible to run several proxies in the same jvm without the need for special classloaders.

The feature set is almost as good as the rabbit/2.x code; there are a few things missing (like the experimental web spider and the installer program).

You will find it at: http://www.khelekore.org/rabbit/

Have fun.
/robo
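
The java.nio design mentioned in the announcement is the standard selector loop: one thread multiplexes all client sockets instead of one thread per connection. A bare-bones sketch of that event loop - an accept-and-read skeleton, not RabbIT's actual proxy loop:

    import java.io.IOException;
    import java.net.InetSocketAddress;
    import java.nio.ByteBuffer;
    import java.nio.channels.*;
    import java.util.Iterator;

    /** One-thread selector loop: the shape of a non-blocking java.nio server. */
    public class SelectorLoop {
        public static void main(String[] args) throws IOException {
            Selector selector = Selector.open();
            ServerSocketChannel server = ServerSocketChannel.open();
            server.bind(new InetSocketAddress(9668));   // demo port
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT);

            ByteBuffer buf = ByteBuffer.allocate(8192);
            while (true) {
                selector.select();   // blocks until some channel is ready
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    if (key.isAcceptable()) {           // new client: register for reads
                        SocketChannel c = server.accept();
                        c.configureBlocking(false);
                        c.register(selector, SelectionKey.OP_READ);
                    } else if (key.isReadable()) {      // data ready: read without blocking
                        SocketChannel c = (SocketChannel) key.channel();
                        buf.clear();
                        if (c.read(buf) == -1)
                            c.close();                  // peer closed the connection
                        // a proxy would parse and forward here; this sketch discards
                    }
                }
            }
        }
    }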
From: Maryan S. <Xe...@uk...> - 2006-03-21 14:03:26

Hello rabbit-proxy-users,

Please help me. When I install rabbit and type the command

java rabbit.proxy.proxy

cmd returns an error:

Exception in thread "main" java.lang.NoClassDefFoundError: rabbit.proxy.proxy

I tried to add

bash: CLASSPATH=.:$CLASSPATH; export CLASSPATH
tcsh: setenv CLASSPATH .:$CLASSPATH
windows: set CLASSPATH=.;$CLASSPATH

to a file named CLASSPATH in the rabbit directory... but this doesn't help...

PLEASE HELP ME....

--
Best regards, Maryan
mailto:Xe...@uk...
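
Two things commonly go wrong with this invocation, and either would explain the error. First, Java class names are case-sensitive; the RabbIT main class is usually cited as rabbit.proxy.Proxy with a capital P - an assumption here, so check the README of your release for the exact name. Second, CLASSPATH must be an environment variable or a -cp argument to the JVM; putting those lines in a file named CLASSPATH has no effect. A likely-correct invocation from the directory that contains the rabbit class tree:

    java -cp . rabbit.proxy.Proxy

If that class name is wrong for your version, listing the distribution jar with "jar tf" (or the rabbit/proxy/ directory) will show the exact spelling to use.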
From: Robert O. <ro...@kh...> - 2006-02-08 21:18:46

Rick Leir wrote:
> It is good that you used Measurement Factory's Co-Advisor Test Suite. Do
> they let you re-run it from time to time?

Yes they do. They have been really nice about this. I have not tested rabbit 2.x in some time, but then I have not made any big changes in it either. If you have checked the rabbit3 page in the last month you would have noticed that I have run the tests for it, getting better over time.

At the moment rabbit 3 has only 1 thing (3 test cases) where it violates the specification, and I think that I want it that way. I am not sure yet, and maybe with strict http mode rabbit 3 will be correct. The case is this: if rabbit gets a request with a chunked or multipart resource, rabbit has to provide a content length header when it passes it on, unless rabbit can know that upstream is http/1.1 compliant. Knowing that a random web server is http/1.1 compliant is really not possible, so what do I do? I can buffer the resource either in memory or on disk, but that will introduce latency and prepare the ground for some nice proxy DoS...

The last full test on Co-Advisor was on 6 February 2006.

Anyway, rabbit 3 has a bug in the nio handling; it will in some cases use 100% cpu (I will fix that some day). So rabbit 3 is not ready for real use, yet. But please test it and tell me how it works.

About java.util.logging: I thought about it, but I am not sure that it is what I want. Note that in rabbit 3 it is very easy to implement your own log handler; perhaps I will add one logger based on your example and make it configurable.

/robo
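
The dilemma Robo describes exists because a chunked body's total length is only known once the last chunk arrives, so emitting a Content-Length header forces full buffering first. A compact sketch of dechunking a body into a buffer, after which the length is finally known (standalone illustration, not RabbIT code):

    import java.io.*;

    /** Decode a chunked HTTP body; only after the last chunk is its length known. */
    public class DechunkSketch {
        public static byte[] dechunk(InputStream in) throws IOException {
            // This buffer is exactly the latency/DoS cost Robo wants to avoid.
            ByteArrayOutputStream body = new ByteArrayOutputStream();
            DataInputStream din = new DataInputStream(in);
            while (true) {
                int size = Integer.parseInt(readLine(din).split(";")[0].trim(), 16);
                if (size == 0) break;        // last-chunk marker
                byte[] chunk = new byte[size];
                din.readFully(chunk);
                body.write(chunk);
                readLine(din);               // CRLF after the chunk data
            }
            return body.toByteArray();       // length is now the Content-Length
        }

        private static String readLine(InputStream in) throws IOException {
            StringBuilder sb = new StringBuilder();
            for (int c; (c = in.read()) != -1 && c != '\n'; )
                if (c != '\r') sb.append((char) c);
            return sb.toString();
        }

        public static void main(String[] args) throws IOException {
            String wire = "5\r\nhello\r\n6\r\n world\r\n0\r\n\r\n";
            byte[] body = dechunk(new ByteArrayInputStream(wire.getBytes("US-ASCII")));
            System.out.println(new String(body, "US-ASCII")
                + " (" + body.length + " bytes)");
        }
    }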
From: Rick L. <rl...@le...> - 2006-02-08 20:52:09

It is good that you used Measurement Factory's Co-Advisor Test Suite. Do they let you re-run it from time to time?

cheers -- Rick

--
Rick Leir 613-828-8289
-> rick AT leirtech DOT com
http://www.leirtech.com/rick/
From: Rick L. <rl...@le...> - 2006-02-08 20:01:00

Hi Robo,

Rabbit works really well for me. Just a minor suggestion for logging: use java.util.logging instead of rabbit.util.Logger. More flexible, and it can log to XML. (Maybe it would be useful to log to a CSV file and view it as a spreadsheet in OO.) I have not mapped the Rabbit log types to the Sun levels, just catted them for now. Next: it would be nice if log rotation was automatic.

cheers -- Rick

In Proxy:

    public static Logger logger;

    private void setupErrorLog() {
        String sLogDir =
            config.getProperty(getClass().getName(), "errorlogdir", "logs");
        try {
            FileHandler fh = new FileHandler(
                sLogDir + java.io.File.separatorChar + "error_log");
            logger = Logger.getLogger("LoggingExample1");
            logger.setLevel(Level.INFO);
            // fh.setFormatter(new XMLFormatter());
            // fh.setFormatter(new SimpleFormatter());
            fh.setFormatter(new SimplerTextFormatter());
            logger.addHandler(fh);
        } catch (IOException e) {
            System.out.println("cannot open logfile ");
        }
    }

    public void logError(int type, String error) {
        if (type < loglevel)
            return;
        String stype = getErrorLevelString(type);
        //zzz Date d = new Date();
        // d.setTime(d.getTime() - offset);
        StringBuilder sb = new StringBuilder("[");
        //zzz synchronized (sdfMonitor) {
        //     sb.append(sdf.format(d));
        // }
        // sb.append("][");
        sb.append(stype);
        sb.append("][");
        sb.append(error);
        sb.append("]");
        logger.log(Level.INFO, sb.toString());
        /* zzzzz synchronized (errorMonitor) {
            errorlog.println(sb.toString()); } */
    }

New class:

    /**
     * Like java.util.logging.SimpleFormatter but simpler.
     * @author Rick Leir LeirTech.com
     */
    package com.leirtech.logging;

    import java.text.SimpleDateFormat;
    import java.util.Date;
    import java.util.logging.Formatter;
    import java.util.logging.LogRecord;

    public class SimplerTextFormatter extends Formatter {
        /** The format we write dates on. */
        private SimpleDateFormat sdf =
            new SimpleDateFormat("dd/MMM/yyyy:HH:mm:ss 'GMT' ");

        public String format(LogRecord lr) {
            Date d = new Date(lr.getMillis());
            String s = sdf.format(d) + lr.getLevel() + " " + lr.getMessage();
            Throwable t = lr.getThrown();
            if (t != null)
                s += t.toString();
            s += "\n";
            return s;
        }
    }

--
Rick Leir 613-828-8289
-> rick AT leirtech DOT com
http://www.leirtech.com/rick/
From: Robert O. <ro...@kh...> - 2006-01-29 15:31:03

rr...@le... wrote:
> Rabbit works really well for me.

Glad to hear that.

> The config file line compress=false does not work.
> The work-around is to comment out the individual streams such as:
> #text/plain=rabbit.handler.GZIPHandler

Yes, and that does not work if you use FilterHandler to remove background images and such things.

> Maybe I will do a patch when I figure out how the factory works.

It ought to be a trivial fix. I will make sure that it works in rabbit/3.0 (pre-release).

Have fun
/robo
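
The fix being discussed amounts to a single guard in the handler factory: honour the compress flag before wrapping the output stream. A sketch of that shape in plain Java - the property name mirrors the compress=false line above, but none of this is RabbIT's actual factory code:

    import java.io.IOException;
    import java.io.OutputStream;
    import java.util.Properties;
    import java.util.zip.GZIPOutputStream;

    /** Honour a "compress" config flag before gzip-wrapping a response stream. */
    public class CompressToggleSketch {
        public static OutputStream maybeCompress(Properties config, OutputStream raw)
                throws IOException {
            // compress=false must win even when per-mime-type handlers are configured
            boolean compress =
                Boolean.parseBoolean(config.getProperty("compress", "true"));
            return compress ? new GZIPOutputStream(raw) : raw;
        }

        public static void main(String[] args) throws IOException {
            Properties config = new Properties();
            config.setProperty("compress", "false");
            OutputStream out = maybeCompress(config, System.out);
            out.write("passed through uncompressed\n".getBytes("US-ASCII"));
            out.flush();
        }
    }

Putting the check at the factory level, rather than in each handler, is what makes the global switch override the per-content-type handler entries that the work-around comments out.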