Re: [Filterproxy-devel] Re: New module for FilterProxy
From: Bob M. <mce...@dr...> - 2001-08-05 07:04:40
John F Waymouth [way...@WP...] wrote:
>
> Ok, I've done some researching. I think we want to write our own HTML
> prettifier (I found a few, one that didn't use HTML::Parser even, but it
> didn't fit our purposes). I found a package, Algorithm::Diff, which does
> exactly what we want: it allows traversal of a diff sequence. If you're
> ok with that one more single dependency, we could use this.

That's fine (see below on eval). I want to go further than a diff, though
(for instance, also insert the name of the rule that made the change).

> A thought occurs to me, because you're cringing about dependencies. I
> don't think it should be necessary for me to have installed Compress::Zlib
> if I'm not using the Compress filterproxy module. I think modules should
> only be "use"d if they're in use. How about load a module when you come
> across a filter rule that uses it, except if it's already loaded (MODULES
> contains it)? Just a thought.

Done. Now it loops over the files in FilterProxy/, evals them, and keeps
going if an eval fails. Looking for filter rules would be a bit harder,
since you have to know the module exists before you load it... (i.e.
Compress isn't used now, but you want to add it for some site...) This
also means that modules aren't hard-coded into FilterProxy.pl anymore.

> > I'm gonna try to write a Rewrite-changes-highlighter this weekend.
>
> Ok. I have a few ideas in my head, so let me know if/when you've
> completed yours, so I'll know if I should write my own or look at yours :)

Ok, so I started to outline how this would work, and I just can't get
around the fact that we need to send *extra* data with the request.
Whether we want to return the source or return highlighted source, framed
or not, it's the same: the behavior isn't the default, and we have to
signal that. And it must work for ANY URL.

Is there any other way to get data from the browser to the proxy than in
the URI? Can browsers handle a cookie for the proxy, or a cookie for all
domains?

If the answer is no... the only *proper* way to do it is to write a whole
lot of code along the lines of

    http://localhost:8888/Source.html?get=%%%

Checking Netscape and Mozilla, it looks like multiple ? aren't a problem,
but with multiple # the last one is stripped off. (These aren't intended
for servers anyway.) But adding the data at the end of the URI after a ?
could easily run into the maximum URI-length problem.

> > Another, far less elegant solution would be to have the proxy "tell
> > itself" to grab the source. i.e. load 2 urls in succession:
> >     http://localhost:8888/Source.html?getnexturlsource=true
> >     http://wherever.com...
> > But then both are valid URI's...
>
> Eew. That's a race condition.

Yep. Baaad.

> > or: http://localhost:8888/Source.html?getsourceof=http://... since you
> > can always encode nasties like ? in the second URL with %. The module
> > then just compares each URI to its internal variable $getsourceof...
>
> That could work, but kind of requires new functionality written into the
> core script. I don't know about you, but I think I kind of want to avoid
> that, it's a little inelegant.

Arg. Yeah, but editing the config will require interaction with the core
script too... This mechanism requires a whole lot of code to be written...
basically duplicating &handle_proxy, except storing the data rather than
feeding it to the client. Ick.

Cheers,
-- Bob

Bob McElrath (rsm...@st...)
Univ. of Wisconsin at Madison, Department of Physics
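
A minimal sketch of the Algorithm::Diff traversal mentioned above, assuming
the page is diffed line by line and changes are wrapped in <ins>/<del>
markup; the subroutine name and the markup are illustrative, not
FilterProxy's actual highlighter:

    use strict;
    use Algorithm::Diff qw(traverse_sequences);

    # Walk two versions of a page (before and after filtering) and wrap
    # whatever changed in markup a browser can highlight.
    sub highlight_changes {
        my ($before, $after) = @_;   # array refs of lines (or tokens)
        my @out;
        traverse_sequences($before, $after, {
            # unchanged line: copy it through
            MATCH     => sub { push @out, $before->[$_[0]] },
            # line only in the original: a rule deleted it
            DISCARD_A => sub { push @out, "<del>" . $before->[$_[0]] . "</del>" },
            # line only in the filtered output: a rule inserted it
            DISCARD_B => sub { push @out, "<ins>" . $after->[$_[1]] . "</ins>" },
        });
        return join "\n", @out;
    }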
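
A minimal sketch of the loader described above ("loops over the files in
FilterProxy/, evals them, and keeps going if an eval fails"), assuming
FilterProxy/ is reachable via @INC (e.g. the current directory) and that a
%MODULES hash tracks what loaded; both details are assumptions, not the
actual FilterProxy.pl code:

    use strict;

    # Fault-tolerant module loader: try every .pm file under FilterProxy/
    # and skip any that fail to compile, instead of hard-coding the
    # module list.
    my %MODULES;
    opendir(my $dir, "FilterProxy") or die "can't open FilterProxy/: $!";
    for my $file (sort grep { /\.pm$/ } readdir($dir)) {
        my ($name) = $file =~ /^(\w+)\.pm$/ or next;
        eval { require "FilterProxy/$file" };
        if ($@) {
            warn "FilterProxy: skipping module $name: $@";  # keep going
            next;
        }
        $MODULES{$name} = 1;   # remember what loaded successfully
    }
    closedir($dir);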
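
And a minimal sketch of the ?getsourceof= scheme under discussion, assuming
URI::Escape is used to %-escape the target URL so its own ? and # can't
confuse the proxy; the URL layout follows the thread, the surrounding
handler logic is only illustrative:

    use strict;
    use URI::Escape qw(uri_escape uri_unescape);

    # Client side: build the magic URL, %-escaping the real target.
    my $target  = "http://wherever.com/page.html?a=1#frag";
    my $request = "http://localhost:8888/Source.html?getsourceof="
                . uri_escape($target);

    # Proxy side: recognize the magic URL and recover the URL to fetch.
    if ($request =~ m{^http://localhost:8888/Source\.html\?getsourceof=(.+)$}) {
        my $getsourceof = uri_unescape($1);
        # ...fetch $getsourceof and return its raw or highlighted source...
    }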