Re: [Filterproxy-devel] Re: New module for FilterProxy
From: Bob M. <mce...@dr...> - 2001-08-23 01:01:32
John F Waymouth [way...@WP...] wrote:
> Hey, sorry I haven't responded in awhile, I've been pretty busy.

Me too.  Just got back from California.  I've half-written the Rewrite
markup thing...I'll release a new version as soon as it works.

> On Sun, 5 Aug 2001, Bob McElrath wrote:
> > Well, it turned out to be pretty easy.  Two files, maybe 15 lines extra
> > total.  (Attached -- to use it, add $agent to the "use vars" list at the
> > beginning of FilterProxy.pl, and change "my $agent" to be just "$agent"
> > on line 171.)  And escaping in javascript turned out to be trivial, it's
> > a function called "escape"... heh.
>
> Hmm, this is decent, but there's the problem of not running the request
> through Header.pm, or anything else the user might want.  Maybe you should
> hook into the handler function in FilterProxy.pl?  This, of course, yet
> again brings up the question of how to hook in Source.pm, but we can throw
> in a bogus header or even make the request have a source:// instead of
> http://, 'cause we're dealing with the request internally.  Then we get
> the best of both worlds.  Sorta.

Yes, I think I fixed that by explicitly calling
FilterProxy::handle_filtering(-10,1,2,3), which then calls Header (and
other modules, if necessary).  Note that it does not call handle_filtering
for Orders that modify the content.  (See the comments at the beginning of
Skeleton.pm.)

> BTW, to avoid using URI::Escape (and having yet another dependency, unless
> that's standard?) you can use CGI::unescape for URL encoding, and
> CGI::escapeHTML to do all your > and such.  Don't forget to html-ify tabs
> and newlines, though, and may as well get spaces as well.

CGI is already a dependency...and it is included in LWP, which is also a
dependency.

> > This method also opens the door wide to marking up or reformatting the
> > source.  This bit of javascript, when bookmarked, will act like
> > "view source".
>
> It will take a fair amount of trickery and interoperation between Rewrite
> and Source to build up the diff list.  Perhaps we should be really darn
> sneaky (cheat) and store more data in the $res hash.  It's just a hash,
> after all.  Will perl allow us to mess with someone else's blessed hash?
>
> Otherwise, I suppose we can hook Source.pm in a third time, at level 1, to
> put in an internal header that Rewrite recognizes, which tells it to build
> up differences.  Or something like that.

Well, what I've done is set a flag ($markupinstead) which tells Rewrite to
build a @markup data structure instead of modifying the source.  Then I
call FilterProxy::handle_filtering for Rewrite's Order, and afterwards I
parse this data structure, marking up each piece with the name of the rule
that matched it.  Both the flag and the data structure are variables in the
FilterProxy::Rewrite namespace.  This isn't a race condition, since it is
all executed by a single FilterProxy child process, which resets the flag
when it's done.  Ugly, but it works.

It turns out that the really hard part is overlapping modifications (which
are pretty common, actually).  Marking up non-overlapping ones was easy and
could be done in one pass.  The two-pass method described above is
necessary in case two matches overlap: matches can grow backwards, and
would grow over a previously marked-up section!  It's further complicated
by the fact that Rewrite also has to parse the data structure to make sure
the piece it's examining hasn't already been "stripped".  Ugh!
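In case a sketch helps, here is roughly the shape of it.  This is
simplified, not the actual code: the @markup layout ([start, end,
rulename]), the handle_filtering() call, and the <span> wrapper are
approximations, and it glosses over the overlap bookkeeping I was
complaining about above.

    # Sketch only -- element layout, handle_filtering() arguments, and the
    # HTML wrapper are approximations, not the real FilterProxy code.
    package FilterProxy::Source;
    use strict;

    sub markup_source {
        my ($res) = @_;    # the HTTP::Response whose source is being viewed

        # Tell Rewrite to record what it *would* strip instead of stripping.
        $FilterProxy::Rewrite::markupinstead = 1;
        @FilterProxy::Rewrite::markup        = ();

        # Run Rewrite's Order as usual; it now fills @markup with
        # [start, end, rulename] ranges rather than editing the content.
        FilterProxy::handle_filtering($res);

        # Second pass: splice in the highlight tags, working from the end
        # of the document backwards so earlier offsets stay valid.
        my $src = $res->content;
        for my $m (sort { $b->[0] <=> $a->[0] } @FilterProxy::Rewrite::markup) {
            my ($start, $end, $rule) = @$m;
            substr($src, $end,   0) = '</span>';
            substr($src, $start, 0) = qq{<span class="stripped" title="$rule">};
        }
        $res->content($src);

        # Safe to reset here: a single FilterProxy child handles the request.
        $FilterProxy::Rewrite::markupinstead = 0;
    }

The backwards walk is only there so earlier offsets stay valid while the
tags are spliced in; by itself it does nothing about two matches that
overlap, which is where the extra bookkeeping comes in.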
> > There is, it is implementation dependent, 4096 bytes is most common, I
> > think, but I've seen people complain about implementations that use 1024
> > bytes.
>
> Alright.  We'll have to see how well this works; remember that long query
> strings will be even more elongated, because every % becomes a %25.

Well, for the time being I'll keep both methods, so it will still be
possible to do http://source/...  Since the long URL only travels
browser->proxy, only a browser limitation would cause a problem.
HTTP::Daemon, which parses the headers for FilterProxy, has a limit of 16k
on the URI, so we should be ok.  Neither RFC 2068 nor RFC 2616 specifies
how long URIs or headers can be, but both define the 413 and 414 error
codes for headers/URIs that are too long.  (There's a quick sketch of the
%25 inflation at the end of this mail.)

> > I know this isn't exactly what you wanted, John, but take a look at the
> > attached files and let me know what you think.
>
> I suppose it'll work.  I hadn't thought of hooking into Config; I thought
> you were planning to write everything in the embedded perl, which wouldn't
> be too happy.  It's your proxy, it's your choice.  We'll see how it works.

Ack, not in embedded perl...that would be ugly.  ;)

> > P.S.  I added a workaround for the Mozilla reload-hang.  0.29.2 "Real
> > Soon Now".
>
> Maybe you could send a prerelease my way? ;)

The fix has also gone into the mozilla trunk.  It may be easier to grab a
new nightly, since I'm deathly slow... ;)

Cheers,
-- Bob

Bob McElrath (rsm...@st...)
Univ. of Wisconsin at Madison, Department of Physics
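P.S.  Here's the kind of %25 inflation John is talking about above, just so
we're on the same page.  Sketch only: the URL is made up, and I'm assuming
CGI::escape()/CGI::unescape() for the Perl side of the round trip (the
bookmarklet side would use javascript's escape()).

    #!/usr/bin/perl
    # Sketch: escaping an already percent-encoded URL turns every '%' into
    # '%25', so long query strings get longer still.  The URL is made up.
    use strict;
    use CGI ();

    my $target = 'http://example.com/search?q=foo%20bar&lang=en';

    # What would get stuffed into the source-view request URI.
    my $embedded = CGI::escape($target);
    print "escaped : $embedded\n";    # ...q%3Dfoo%2520bar... ('%' -> '%25')

    # On the proxy side, CGI::unescape() recovers the original URL.
    print "restored: ", CGI::unescape($embedded), "\n";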