[Rabbit-proxy-development] 060403: Pre-3.1 impressions & ideas for future releases
From: <mat...@ne...> - 2006-04-03 14:07:22
* 060403: Pre-3.1 impressions & ideas for future releases

I have tested the pre-3.1 version and it did fix most of my problems; it fixed all of the reported ones. There is one more problem regarding Firefox and incomplete pages in connection with time-outs. It looks like the connection sometimes simply hangs, and then an incomplete page is returned and cached by Firefox. I don't know exactly what is happening, but I have not seen such behaviour with any other proxy, and I have some experience with them.

There is also a condition that you may want to handle gracefully. Some webmasters send "-1" as the value of the "Expires" header. I know this is a violation of the RFC, but it is common, so it should be handled gracefully: treat the response as already expired and do not cache it.

As for my suggestions for the future, I would like to propose enhancements that would improve the user experience in two areas:

* Ads are real bandwidth hogs and are annoying.
* The proliferation of satellite and mobile (3G GSM - EDGE, UMTS) connections with very high latency (500 ms+) requires minimising the number of requests sent from client to server.

If you implemented ad blocking and URL blocking in a way that lets the user load pre-prepared block lists, you would really help them. The most popular ad-block lists are the Filterset.G lists for the Adblock Firefox extension ( http://www.pierceive.com/filtersetg/ ). They come as two separate lists, a black list and a white list. Implementing an import of these lists into Rabbit would be a real help for the users.

My suggestions for new ad-blocking features are the following:

* Implement white-list and black-list definitions for "rabbit.filter.BlockFilter" as well as for "rabbit.filter.AdFilter" ("blockURLmatching" and "DontBlockURLmatching"). This would allow for more relaxed filtering with optional white-listing of certain sites.
* Besides the current format for filters, allow patterns to be read from files in the Filterset.G list format ("blockURLmatchingFile" and "DontBlockURLmatchingFile"). You should probably convert the patterns to a common format and merge them with the ones from "blockURLmatching"/"DontBlockURLmatching".
* Enhance "rabbit.filter.BlockFilter" to block HTTPS URLs as well. I have explained in a previous message why I find this important.

My suggestions for accelerating high-latency links are the following:

* When possible, embed external files into the HTML using the RFC 2397 data URI scheme (IMG, SCRIPT and STYLE tags - fetch the file referenced by SRC/HREF and replace the reference). References: http://en.wikipedia.org/wiki/Data:_URL , http://www.mozilla.org/quality/networking/docs/aboutdata.html . I know that this is currently only supported by the Mozilla and Opera browsers, but it would probably help tremendously on high-latency links. There is a way to get partial RFC 2397 support in IE through a protocol handler, but it will be limited by the URL length limit in IE. I've put a copy of the IE plugin on my server: http://neosys.si/users/Matej/DataProtocol.zip . Examples:

  - http://neosys.si/users/matej/rabbit/Data_SiOL.net.htm (Opera and Netscape)
  - http://neosys.si/users/matej/rabbit/Data_IE_SiOL.net.htm (IE - doesn't support GIF data URIs?!?)
  - http://www.scalora.org/projects/uriencoder/ (original: http://neosys.si/users/matej/rabbit/SiOL.net.htm )

  I know that data URIs are in general limited to 1024-4096 bytes (unlimited in Mozilla) and that embedding would actually increase the file size and defeat the caching effect; a small code sketch of the idea follows below.
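To make the embedding idea concrete, here is a minimal sketch in Java. The class name, method name and the 16384-byte limit are my own illustrative assumptions, not Rabbit's actual filter API: given the bytes and MIME type of a resource the proxy has already fetched, it builds the RFC 2397 "data:" URI that would replace the original SRC/HREF value.

import java.util.Base64;

// Minimal sketch of RFC 2397 embedding, independent of Rabbit's real filter
// classes: turn an already-fetched resource into a "data:" URI, or give up
// if it is too large to be worth inlining.
public class DataUriSketch {

    // Illustrative limit only; Firefox can take much larger URIs.
    private static final int MAX_EMBED_SIZE = 16384;

    // Returns the data URI, or null if the resource should keep its URL.
    public static String toDataUri (String mimeType, byte[] body) {
        if (body == null || body.length > MAX_EMBED_SIZE)
            return null;
        return "data:" + mimeType + ";base64,"
               + Base64.getEncoder ().encodeToString (body);
    }

    public static void main (String[] args) {
        // A tiny fake "image" just to show the shape of the result.
        byte[] fakeGif = { 'G', 'I', 'F', '8', '9', 'a' };
        System.out.println (toDataUri ("image/gif", fakeGif));
        // Prints: data:image/gif;base64,R0lGODlh
    }
}

A real implementation would of course have to rewrite the attribute in the HTML stream and fall back to the original URL when the resource is too large or the browser is not known to support data URIs.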
Increasing the file size and defeating caching is against the current goals, but I see the following arguments:

  o High-latency links generally also have high throughput.
  o Due to the size reduction from JPEG re-compression, the files could still end up smaller.
  o A limit of 4096 bytes would - thanks to the JPEG size reduction - suffice for most sites, and with Firefox this limitation does not exist anyway.
  o Caching is not so important for pages where you don't browse around the site and there are many new images anyway (news sites).

  I would suggest the following configuration variables:

  - enableDataURIforTags=IMG|STYLE|SCRIPT
  - enableDataURIforObjectsWithExtension=JPG|JPEG|CSS|GIF|JS
  - maximumSizeForDataURI=16384 ; Firefox can take it.
  - dontEmbeddDataURIforSites=

* I would also replace HREFs that point to URLs blocked by the AdFilter/BlockFilter with one of the following:

  o An HREF to a fixed error page on the Rabbit server. This would allow the response to be cached.
  o HREF=data:,Rabbit%20denied%20this%20page - this would remove the need for a round trip to the server for the 403 message. Unfortunately it would mask the destination URL, but since the user can request the unfiltered page, he can still find it.

I know there would be another option - multipart encoding - but I have no idea how well that is supported across browsers.

Then there is one last proposal: you could implement SSL filtering as well. Proxomitron is a great example of how it could be done; it uses a temporary SSL key between client and proxy, and temporary or predefined SSL certificates when communicating with the remote servers.

--
Best regards, Matej.