[Rabbit-proxy-development] 060403: Pre-3.1 impressions & ideas for the future releases
From: <mat...@ne...> - 2006-04-03 14:07:22
* 060403: Pre-3.1 impressions & ideas for the future releases
I have tested the pre-3.1 version and it fixed most of my problems,
including all of the ones I reported.
There is one more problem regarding Firefox and incomplete pages in
connection with time-outs. It looks like the connection will
sometimes simply hang, and an incomplete page is then returned and
cached by Firefox. I don't know exactly what is happening, but I have
not seen such behaviour with any other proxy, and I have experience
with quite a few of them.
There is one condition that you may want to handle gracefully. Some
webmasters send "-1" as the value of the "Expires" header. I know
this violates the RFC, but it is common, and it should simply be
treated as "already expired"/"do not cache".
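To make this concrete, here is a rough Java sketch (the class and
method names are made up, this is not from the Rabbit code base) of
how an unparsable Expires value such as "-1" could simply be treated
as already expired:

import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

/** Hypothetical helper: any Expires value that does not parse as an
 *  HTTP date (for example "-1" or "0") is treated as already expired. */
public class ExpiresParser {
    private static final SimpleDateFormat RFC1123 =
        new SimpleDateFormat ("EEE, dd MMM yyyy HH:mm:ss zzz", Locale.US);
    static {
        RFC1123.setTimeZone (TimeZone.getTimeZone ("GMT"));
    }

    /** Returns the expiry time, or a time in the past if the value is
     *  invalid. Synchronized because SimpleDateFormat is not thread safe. */
    public static synchronized Date parseExpires (String value) {
        if (value == null)
            return new Date (0);   // no value: be conservative, do not cache
        try {
            return RFC1123.parse (value.trim ());
        } catch (ParseException e) {
            return new Date (0);   // "-1", "0" or garbage: already expired
        }
    }
}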
As for the future, I would like to suggest the following
enhancements, which would improve the user experience in two areas:
* Ads are real bandwidth hogs and are annoying.
* The proliferation of satellite and mobile (3G GSM - EDGE, UMTS)
connections with very high latency (500ms+) calls for minimising
the number of requests sent from client to server.
If you implemented ad blocking and URL blocking in a way that lets
users plug in pre-prepared block lists, you would really help them.
The most popular ad-block lists are the Filterset.G sets for the
Adblock Firefox extension ( http://www.pierceive.com/filtersetg/ ).
They come as two separate lists: a blacklist and a whitelist.
Supporting an import of them in Rabbit would be a big help.
My suggestions for new ad-blocking features would be the following:
* Implement white-list and black-list definitions for
"rabbit.filter.BlockFilter" as well as for
"rabbit.filter.AdFilter" ("blockURLmatching" and
"DontBlockURLmatching"). This would allow for more relaxed
filtering with optional white-listing of certain sites.
* Besides the current filter format, allow patterns to be read from
a file in the Filterset.G list format ("blockURLmatchingFile" and
"DontBlockURLmatchingFile"). You would probably convert the
patterns to a common format and merge them with the ones from
"blockURLmatching"/"DontBlockURLmatching"; see the sketch after
this list.
* Enhance "rabbit.filter.BlockFilter" to blocks HTTPS URLs as well.
I have explained in a previous message why I find this important.
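To illustrate the file-based lists, here is a rough sketch of how the
patterns could be read and merged with the inline configuration
value. The class name, the one-pattern-per-line layout and the '#'
comment convention are my assumptions, not the actual Filterset.G
syntax:

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical loader: reads one pattern per line (lines starting
 *  with '#' are treated as comments) and merges them with an inline
 *  value such as "blockURLmatching". */
public class PatternListLoader {

    public static Pattern buildMergedPattern (String inlinePatterns,
                                               String patternFile)
        throws IOException {
        List<String> parts = new ArrayList<String> ();
        if (inlinePatterns != null && inlinePatterns.length () > 0)
            parts.add (inlinePatterns);
        if (patternFile != null) {
            BufferedReader in =
                new BufferedReader (new FileReader (patternFile));
            try {
                String line;
                while ((line = in.readLine ()) != null) {
                    line = line.trim ();
                    if (line.length () == 0 || line.startsWith ("#"))
                        continue;
                    parts.add (line);
                }
            } finally {
                in.close ();
            }
        }
        // Join everything into one alternation so the filter can keep
        // using a single Pattern per direction (block / don't block).
        StringBuilder sb = new StringBuilder ();
        for (int i = 0; i < parts.size (); i++) {
            if (i > 0)
                sb.append ('|');
            sb.append ("(?:").append (parts.get (i)).append (')');
        }
        return Pattern.compile (sb.toString ());
    }
}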
My suggestions for accelerating high-latency links are the following:
* When possible, embed external files into the HTML using the
RFC 2397 data URI scheme (IMG, SCRIPT and STYLE tags - you fetch
the file from SRC/HREF and replace the reference; see the sketch
after this list). References: (
http://en.wikipedia.org/wiki/Data:_URL,
http://www.mozilla.org/quality/networking/docs/aboutdata.html ).
I know that this is currently only supported by the Mozilla and
Opera browsers, but it would probably help tremendously on
high-latency links. There is a way to get partial RFC 2397 support
in IE through a protocol handler, but it will be limited by the
URL length limit in IE. I've put a copy of the IE plugin on my
server: "http://neosys.si/users/Matej/DataProtocol.zip".
Examples:
- http://neosys.si/users/matej/rabbit/Data_SiOL.net.htm
(Opera and Netscape)
- http://neosys.si/users/matej/rabbit/Data_IE_SiOL.net.htm
(IE - GIF data URIs don't seem to be supported?!?)
- http://www.scalora.org/projects/uriencoder/
(original: http://neosys.si/users/matej/rabbit/SiOL.net.htm )
I know that data URIs are in general limited to 1024-4096 bytes
(unlimited in Mozilla), that they actually increase the file
size, and that they defeat caching. This goes against the current
goals, but I see the following arguments:
o High-latency links generally have a high throughput.
o Thanks to the size reduction from JPEG re-compression, the
files could still end up smaller.
o A limit of 4096 bytes would - due to the JPEG size reduction
- suffice for most sites, and with Firefox the limitation
does not exist at all.
o Caching is not so important for pages where you don't browse
around the site and where most images are new anyway (news
sites).
I would suggest the following configuration variables:
- enableDataURIforTags=IMG|STYLE|SCRIPT
- enableDataURIforObjectsWithExtension=JPG|JPEG|CSS|GIF|JS
- maximumSizeForDataURI=16384 ; Firefox can take it.
- dontEmbedDataURIforSites=
* I would also replace HREFs that point to URLs blocked by the
adfilter/blockfilter with one of the following:
o An HREF to a fixed error page on the RABBIT server. This
would allow the response to be cached.
o HREF=data:,Rabbit%20denied%20this%20page
This would remove the need for a round trip to the server
for the 403 message. Unfortunately it would mask the
destination URL, but since the user can request the
unfiltered page, he can still find it.
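To make the data URI idea more concrete, here is a rough sketch of
how a referenced object could be fetched and turned into an RFC 2397
data URI, with a size limit corresponding to the proposed
"maximumSizeForDataURI" variable. Class and method names are made up,
and in the proxy the bytes would of course come from its own
fetch/cache path rather than a plain URL fetch:

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.util.Base64;

/** Hypothetical sketch: fetch a small object and build a data URI
 *  that can replace the original SRC/HREF attribute value. */
public class DataUriEmbedder {

    /** Maximum size to embed, as in the proposed maximumSizeForDataURI. */
    private final int maxSize;

    public DataUriEmbedder (int maxSize) {
        this.maxSize = maxSize;
    }

    /** Returns a data URI for the resource, or null if it is too big. */
    public String toDataUri (String url, String mimeType) throws IOException {
        InputStream in = new URL (url).openStream ();
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream ();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read (chunk)) > 0) {
                buf.write (chunk, 0, n);
                if (buf.size () > maxSize)
                    return null;     // over the limit: keep the original URL
            }
            String b64 = Base64.getEncoder ().encodeToString (buf.toByteArray ());
            return "data:" + mimeType + ";base64," + b64;
        } finally {
            in.close ();
        }
    }
}

For example, new DataUriEmbedder (16384).toDataUri
("http://example.com/logo.jpg", "image/jpeg") would return a string
starting with "data:image/jpeg;base64," that the HTML filter could put
directly into the IMG SRC attribute (example.com is just a
placeholder).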
I know that there would be another option - multipart encoding - but
I have no idea how well it is supported across browsers.
Then there is one last proposal: you could also implement SSL
filtering. Proxomitron is a great example of how it could be done. It
uses a temporary SSL key between the client and the proxy, and
temporary or predefined SSL certificates when communicating with the
remote servers.
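Just to sketch the mechanics (only the socket handling, not the
on-the-fly certificate generation that Proxomitron does; the keystore
path and class names are made up, this is not from the Rabbit code
base):

import java.io.FileInputStream;
import java.net.Socket;
import java.security.KeyStore;
import javax.net.ssl.KeyManagerFactory;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

/** Hypothetical sketch: the proxy presents its own certificate to the
 *  client and opens a separate TLS connection to the real server, so
 *  the plain-text stream in between can be filtered. */
public class SslBump {

    private final SSLContext proxyContext;

    /** keystoreFile/password hold the proxy's own certificate and key. */
    public SslBump (String keystoreFile, char[] password) throws Exception {
        KeyStore ks = KeyStore.getInstance ("JKS");
        ks.load (new FileInputStream (keystoreFile), password);
        KeyManagerFactory kmf =
            KeyManagerFactory.getInstance (KeyManagerFactory.getDefaultAlgorithm ());
        kmf.init (ks, password);
        proxyContext = SSLContext.getInstance ("TLS");
        proxyContext.init (kmf.getKeyManagers (), null, null);
    }

    /** Wrap an already accepted client socket (after the CONNECT reply)
     *  so that the proxy acts as the TLS server towards the browser. */
    public SSLSocket wrapClientSide (Socket client) throws Exception {
        SSLSocket s = (SSLSocket) proxyContext.getSocketFactory ().createSocket (
            client, client.getInetAddress ().getHostName (), client.getPort (), true);
        s.setUseClientMode (false);
        return s;
    }

    /** Open the proxy's own TLS connection to the real server. */
    public SSLSocket connectToServer (String host, int port) throws Exception {
        SSLSocketFactory f = (SSLSocketFactory) SSLSocketFactory.getDefault ();
        return (SSLSocket) f.createSocket (host, port);
    }
}

Everything read from the client-side socket is then plain HTTP that
the normal filters can inspect before it is written to the
server-side socket, and vice versa.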
--
Best regards,
Matej.