filterproxy-devel Mailing List for FilterProxy (Page 4)
Brought to you by: mcelrath
Messages by month (months with no messages omitted):

2001:           Mar (2)   Apr (1)   May (1)   Jun (2)   Jul (2)   Aug (19)  Sep (1)  Oct (5)  Nov (2)
2002:  Jan (9)  Mar (3)   Apr (5)   May (15)  Jun (1)   Jul (4)   Aug (3)   Sep (1)  Oct (1)
2003:  Feb (1)  Apr (1)   Oct (2)
2006:  Jan (1)  Feb (1)   Mar (3)   Oct (2)   Nov (1)
2007:  Feb (1)  Mar (1)   Apr (1)   May (1)
2008:  Feb (1)
2009:  Dec (1)

From: Bob M. <mce...@dr...> - 2001-08-05 04:46:51
|
John F Waymouth [way...@WP...] wrote:
> On Sat, 4 Aug 2001, Bob McElrath wrote:
> > If you can suggest a syntax for inside-and-add, I'll add it. It's
> > pretty trivial to code.
>
> Oh, gotcha, you're right. Can't think of anything offhand, except maybe
> "regex /foo/ require encloser <bar>" or something.

Yeah, I thought of that too, but it's ambiguous with:

    regex /foo/ inside tagblock <bar>

I also thought of "require":

    regex /foo/ require encloser <bar>

but that masks the predicates:

    regex /foo/ inside encloser <bar>

Maybe both?

    regex /foo/ require inside tagblock <bar>

Have to think about this a little bit more...

--
Bob

Bob McElrath (rsm...@st...)
Univ. of Wisconsin at Madison, Department of Physics
From: John F W. <way...@WP...> - 2001-08-05 04:39:39
|
On Sat, 4 Aug 2001, Bob McElrath wrote:
> If you can suggest a syntax for inside-and-add, I'll add it. It's
> pretty trivial to code.

Oh, gotcha, you're right. Can't think of anything offhand, except maybe
"regex /foo/ require encloser <bar>" or something.

> Yup, bug 92915. It's a mozilla bug. (since it's not sending things to
> the proxy...) Highly annoying. Easy to trigger by hitting reload...

K, as long as this one's out of my hands... ;) Annoying, though.
From: John F W. <way...@WP...> - 2001-08-05 04:36:57
|
Bob McElrath wrote:
> More rules will be happily accepted. ;) I know little about javascript
> at this point.

I have a passable knowledge of it, but you're right, we DO end up trying to
fake our way around the Halting Problem in trying to figure out if a
javascript block does something specific.

> I just added a big red warning if you try to disable Header. In testing
> it, I saw it leaving numbers at the top (content-length in hex, BTW) and
> 0 at the bottom...when Header was disabled.

Ah, that was what I thought you said was a Mozilla bug. Glad we have that
fixed.

> Ick, I really don't want to do that. Frames are evil... Which browsers
> don't support javascript anyway? If you want to use something like
> FilterProxy, you can enable javascript, and filter the bad stuff. ;)

Good argument.

> A "control page" would be better, I think. List the last several URLs
> loaded, and let you manipulate them. (show source, show filtering, ...)
> This one's on my TODO list.

That works.

> I did a web search and ran across an HTML diff tool for "only" $149.95.
> I don't see anything in CPAN that would be useful. Undoubtedly someone
> has written something similar using HTML::Parser, but I don't want to
> add HTML::Parser to this project, it has enough dependencies as it is,
> and HTML::Parser is a big one.

OK, I've done some research. I think we want to write our own HTML
prettifier (I found a few, one that didn't even use HTML::Parser, but it
didn't fit our purposes). I found a package, Algorithm::Diff, which does
exactly what we want: it allows traversal of a diff sequence. If you're OK
with that one more single dependency, we could use it.

A thought occurs to me, because you're cringing about dependencies. I don't
think it should be necessary for me to have installed Compress::Zlib if I'm
not using the Compress FilterProxy module. I think modules should only be
"use"d if they're in use. How about loading a module when you come across a
filter rule that uses it, unless it's already loaded (MODULES contains it)?
Just a thought.

> I'm gonna try to write a Rewrite-changes-highlighter this weekend.

OK. I have a few ideas in my head, so let me know if/when you've completed
yours, so I'll know if I should write my own or look at yours :)

> Multiple ? could be OK, as long as we strip it off before sending it on.
> Another option might be to append #FilterProxy:gimmedasource to the end
> of the URL (which is a valid URI).

We've got not one, but two places where someone who's a stickler for the
URI RFC could screw us up. Not only do we have to send a valid request to
the end server, from the proxy, but we also have to make a URL that won't
gag any browsers. The appending idea could work decently.

> Nod, pretty clever. I hadn't thought of that until you sent your
> module.
>
> What about an .html file that had two frames, one showing some
> FilterProxy config stuff, and the other the source or highlighted
> source of the page? The frame with the source could still use your
> module...

True. This helps let the true usefulness of a Rewrite change highlighter
show through, by allowing you to react to how your rule modified the page,
and change the rule accordingly. This also allows the source viewer to work
standalone, to replace a browser's view-source capability, and for browsers
not supporting frames. I like it. :)

> Another, far less elegant solution would be to have the proxy "tell
> itself" to grab the source, i.e. load 2 URLs in succession:
>   http://localhost:8888/Source.html?getnexturlsource=true
>   http://wherever.com...
> But then both are valid URIs...

Eew. That's a race condition.

> or: http://localhost:8888/Source.html?getsourceof=http://.... since you
> can always encode nasties like ? in the second URL with %. The module
> then just compares each URI to its internal variable $getsourceof...

That could work, but it kind of requires new functionality written into the
core script. I don't know about you, but I think I kind of want to avoid
that; it's a little inelegant.

> No, CGI scripts are so ugly. I'm looking at HTML::Embperl and
> HTML::Mason. If you have experience with either, I'd like to hear it...

Nope, sorry.

> > Ah, but the PORT is still changeable... This'll kind of obfuscate the
> > source viewing URL, but at least it works with your server scheme, sorta.
> > Source.pm -10 would just have to snag a request to the hostname:8887 (or
> > whatever), and change it over, because the proxy would be in proxy mode.
>
> Hmm...I'll have to think about that...

It might be better to make viewsources get sent to port 23 or somesuch,
which, while kind of kludgey, is CERTAIN (almost) to avoid masking a port on
which resides an actual HTTP server.
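
Algorithm::Diff (the CPAN module mentioned above) exposes the diff as a
sequence you can walk, which is enough for a first cut at the
Rewrite-changes highlighter being discussed. The sketch below is only an
illustration of that idea, not code from FilterProxy or Source.pm; the
function name, the line-level granularity, and the <font>/<strike> markup
are assumptions.

    # Illustrative sketch only (not FilterProxy code): diff the page before
    # and after Rewrite and wrap whatever changed in colored markup.
    use strict;
    use warnings;
    use Algorithm::Diff qw(sdiff);

    sub highlight_changes {
        my ($before, $after) = @_;          # page HTML as strings
        my @old = split /\n/, $before;
        my @new = split /\n/, $after;
        my $out = '';
        for my $hunk (sdiff(\@old, \@new)) {
            my ($flag, $oldline, $newline) = @$hunk;
            if ($flag eq 'u') {             # unchanged line
                $out .= "$newline\n";
            } elsif ($flag eq '-') {        # stripped by Rewrite
                $out .= qq{<font color="red"><strike>$oldline</strike></font>\n};
            } else {                        # '+' added or 'c' changed
                $out .= qq{<font color="green">$newline</font>\n};
            }
        }
        return $out;   # would still need HTML-escaping before display as source
    }
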
From: Bob M. <mce...@us...> - 2001-08-05 04:28:01
|
John F Waymouth [way...@wp...] wrote:
> > > > SCRIPTADS: strip regex #(ads\\.freecity\\.de|flycast\\.com|/RealMedia/ads/)# inside tagblock <script> add encloser <script> alternate add balanced
> >
> > No, "add" will add stuff to the match if it can, but won't fail if it
> > can't. If you want it to fail when stuff isn't found, use predicates
> > "inside" and "containing". (I'll probably also add predicates "before"
> > and "after" someday) I added a note clarifying this in the docs.
> >
> > Thanks.
>
> Ok, so in order not to scan for <script> twice needlessly, the above could
> be made more efficient by doing strip tagblock <script> containing regex
> etc etc etc.

Well...the time it takes for a matcher is proportional to the number of
times the *first* finder matches. So by doing regex first, it will fail on
most pages, and be very fast. If you do tagblock <script> first, it will
match on most pages, and slow things down. Of course, the trade-off is that
if it's found, you have to scan for <script> twice.

Page with the ad (both ways):

    perl FilterProxy/Rewrite.pm adlib/marsnews.html 'strip regex /flycast\.com/ inside tagblock <script> add encloser <script> add alternate add balanced'
    Rewrite: UNNAMED_0 took 0.05983 seconds, 0 failed, 5 successful

    perl FilterProxy/Rewrite.pm adlib/marsnews.html 'strip tagblock <script> containing regex /flycast.com/ add alternate add balanced'
    Rewrite: UNNAMED_0 took 0.05032 seconds, 8 failed, 2 successful

Page without the ad, but several <script> blocks:

    perl FilterProxy/Rewrite.pm adlib/sunday-times.html 'strip regex /flycast\.com/ inside tagblock <script> add encloser <script> add alternate add balanced'
    Rewrite: UNNAMED_0 took 0.00523 seconds, 1 failed, 0 successful

    perl FilterProxy/Rewrite.pm adlib/sunday-times.html 'strip tagblock <script> containing regex /flycast.com/ add alternate add balanced'
    Rewrite: UNNAMED_0 took 0.02345 seconds, 24 failed, 0 successful

As you can see, tagblock <script> first is a factor of 4 slower on pages
without the ad (which will be the majority)...

If you can suggest a syntax for inside-and-add, I'll add it. It's pretty
trivial to code.

> Something odd I've seen in Mozilla, maybe it's a bug in Mozilla, maybe
> it's a bug in the proxy... occasionally, for reasons I can't discover,
> Mozilla will stop being able to use the proxy. You try to go to a site,
> it says it's resolving the host like normal, but it's not firing packets
> at my filter box (so say my hub lights). Seen this? Ideas?

Yup, bug 92915. It's a Mozilla bug (since it's not sending things to the
proxy...). Highly annoying. Easy to trigger by hitting reload...

Cheers,
--
Bob

Bob McElrath (rsm...@st...)
Univ. of Wisconsin at Madison, Department of Physics
From: Bob M. <mce...@us...> - 2001-08-04 23:44:42
|
John F Waymouth [way...@wp...] wrote:
> It can't be that hard. Just a rule to edit out javascript.open enclosed
> by <script>, unless the open call opens something in a list of exceptions.
> Also kill onload events and other such, but I think links with javascript:
> with popups should be allowed, because, in general, annoying content
> doesn't appear from those.

More rules will be happily accepted. ;) I know little about javascript at
this point.

> Well, I was tired last night, and this was a quick hack. I had written my
> filter rule as yours is above, with -() rules for everything else, but
> you're right, Header does need to be run, only Rewrite needs to be
> avoided.

I just added a big red warning if you try to disable Header. In testing it,
I saw it leaving numbers at the top (content-length in hex, BTW) and 0 at
the bottom...when Header was disabled.

> > I'll put these or something similar in the docs soon. (I just recently
> > figured out how to do it). What I *really* want though is something
> > like "edit filtering rules applied to this site". And "view how this
> > site was filtered" -- a la your view source, but with removed/rewritten
> > stuff in funny colors. Someday...
>
> It's great to have these toolbar buttons with javascript, but I see a few
> problems. They don't work if javascript is disabled, and not all browsers
> will support them. Maybe it'd be possible to frame EVERYTHING that runs
> through the proxy, with a few nav buttons?

Ick, I really don't want to do that. Frames are evil... Which browsers
don't support javascript anyway? If you want to use something like
FilterProxy, you can enable javascript, and filter the bad stuff. ;)

A "control page" would be better, I think. List the last several URLs
loaded, and let you manipulate them. (show source, show filtering, ...)
This one's on my TODO list.

> As far as showing the Source with Rewritten chunks in funny colors... this
> is ASKING to be diff'd. I think we could use the inherent abilities of
> your ordering scheme to pull this one off. All you have to do is save a
> copy of the content in a data structure in the module, or in a file,
> before Rewrite is called, then run it again after Rewrite, and do some
> kind of creative diff on it (I'm sure there's a module in CPAN for
> diffing... if not, we could parse output from diff), highlighting changed
> parts, and prettifying everything.

I did a web search and ran across an HTML diff tool for "only" $149.95. I
don't see anything in CPAN that would be useful. Undoubtedly someone has
written something similar using HTML::Parser, but I don't want to add
HTML::Parser to this project, it has enough dependencies as it is, and
HTML::Parser is a big one.

> Well, we could pass it through a beautifier, if such exists, or we could
> write our own that highlights Rewrite changes, but I kind of doubt an
> existing highlighter would be flexible enough to do the highlighting and
> signify rewrite changes.

D'oh. :) I'm gonna try to write a Rewrite-changes-highlighter this weekend.

> > I can think of two other ways of doing this:
> >   http://your.hostname.here:8888/Source.html?http://site.for.source
> > ...or...
> >   source:http://site.for.source/
>
> The first could be cool, but I'm not quite sure it will fall under URI
> specs if it looks like this:
>   http://hostname:8888/Source.html?http://www.google.com/search?q=bah
>
> Because of the double ?. I'd have to look at the RFC again. The second
> won't fly, because source:http:// will confuse browsers. I originally
> tried something like source:// or wysiwyg://, but Mozilla, at least, won't
> let it fly.

Multiple ? could be OK, as long as we strip it off before sending it on.
Another option might be to append #FilterProxy:gimmedasource to the end of
the URL (which is a valid URI).

> The advantage, as I see it, to making a fake domain name, is the ability
> to seamlessly integrate the Source module just like any other module. The
> disadvantage is that it's going to mask a host named "Source", and that
> the URL might not be legit. To fix the first, we could verbosify it some
> more, like using a hostname "viewsourceof" or something, and to fix the
> latter, maybe this falls under spec (I think so):
>
>   http://viewsourceof/http://www.google.com/search?q=bah
>
> I'm not sure I like the idea of throwing everything into a .html file on
> the server end, because that sort of breaks the elegance of making it a
> standard filter module.

Nod, pretty clever. I hadn't thought of that until you sent your module.

What about an .html file that had two frames, one showing some FilterProxy
config stuff, and the other the source or highlighted source of the page?
The frame with the source could still use your module...

Another, far less elegant solution would be to have the proxy "tell itself"
to grab the source, i.e. load 2 URLs in succession:

    http://localhost:8888/Source.html?getnexturlsource=true
    http://wherever.com...

But then both are valid URIs...

or: http://localhost:8888/Source.html?getsourceof=http://.... since you can
always encode nasties like ? in the second URL with %. The module then just
compares each URI to its internal variable $getsourceof...

> > until I've made the move away from eperl, a task I'm not exactly looking
> > forward to.
>
> What are you planning on moving to? Another perl based solution, or
> something else entirely? I'll be very afraid if you actually make them
> CGI scripts, the server end is already quite massive ;)

No, CGI scripts are so ugly. I'm looking at HTML::Embperl and HTML::Mason.
If you have experience with either, I'd like to hear it...

> Ah, but the PORT is still changeable... This'll kind of obfuscate the
> source viewing URL, but at least it works with your server scheme, sorta.
> Source.pm -10 would just have to snag a request to the hostname:8887 (or
> whatever), and change it over, because the proxy would be in proxy mode.

Hmm...I'll have to think about that...

Cheers,
--
Bob

Bob McElrath (rsm...@st...)
Univ. of Wisconsin at Madison, Department of Physics
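
For the "#FilterProxy:gimmedasource" idea above, here is a rough sketch of
how the proxy could peel the marker off before forwarding the request,
using the URI module that libwww-perl already provides. It is only an
illustration, and it assumes the client actually passes the fragment
through to the proxy (many browsers strip fragments before sending a
request); the marker text and the function name are made up.

    # Sketch only: recognize and strip a "#FilterProxy:..." marker from a
    # request URL before it goes out to the origin server.
    use strict;
    use warnings;
    use URI;

    sub strip_proxy_marker {
        my ($url) = @_;
        my $uri  = URI->new($url);
        my $frag = $uri->fragment;
        my $cmd;
        if (defined $frag && $frag =~ /^FilterProxy:(\w+)$/) {
            $cmd = $1;                  # e.g. "gimmedasource"
            $uri->fragment(undef);      # send a clean URL upstream
        }
        return ($uri->as_string, $cmd);
    }

    my ($clean, $cmd) = strip_proxy_marker(
        'http://wherever.com/page.html#FilterProxy:gimmedasource');
    # $clean is "http://wherever.com/page.html", $cmd is "gimmedasource"
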
From: Bob M. <mce...@us...> - 2001-08-03 20:54:53
|
John F Waymouth [way...@wp...] wrote:
> (I can't reply to filterproxy-devel, I imagine, so I'll leave forwarding
> to you)
>
> > Yes, that's a bug, I'll look into it. This really should have the
> > inside encloser <script>.
>
> That's what I'm thinking. I don't think that any window.open in a
> <script> will be benign, and that's a good thing, because I can't exactly
> think of how to consistently parse the first arg for matches. I suppose
> people could always exclude this rule for certain domains.

Yeah, I've avoided attacking javascript so far due to things like this. I
mean, to really correctly filter javascript, you have to solve the Halting
Problem... ;) Other filters have been very successful though.

> > Looks good. You sent this just as I was trying to find a site that has
> > popups. Do you know of any?
>
> Sure. www.sparkmatch.com. I was using it as my testbed, because, if
> you're unlucky, sometimes 3 or 4 windows pop up per page view. This rule
> doesn't remove all of the popups on this site, because it has a few nasty
> tricks, like document.write()ing a <script> call to an ad server. Should
> be filterable.

No popups on that site for me now! (at all) In fact, that page didn't
render correctly under Netscape 4.5 (some javascript error). People that
use document.write() to write their HTML should be taken out and shot.
But, of course, so should Nutscape 4.x... Somehow, with the popups
disabled, it renders correctly though.

> Oh, and don't forget, yahoo is supposedly debuting popups, hence the
> slashdot article :)

Not for us, it won't! ;) Thanks!

BTW, some other rules should be changed to reflect the change in "inside"
behavior (if you apply my patch):

    SCRIPTADS: strip regex #(ads\\.freecity\\.de|flycast\\.com|/RealMedia/ads/)# inside tagblock <script> add encloser <script> alternate add balanced

BTW, have you seen the galeon/skipstone behavior WRT popups, where they
open a new "tab" instead of a window? An elegant solution, if you ask me.
I hope Mozilla adds tabs like that.

Cheers,
--
Bob

Bob McElrath (rsm...@st...)
Univ. of Wisconsin at Madison, Department of Physics
From: Bob M. <mce...@us...> - 2001-08-03 20:13:37
|
Bob McElrath [mce...@us...] wrote:
> John F Waymouth [way...@wp...] wrote:
> > I didn't add an "inside encloser <script>" because that seems to replace
> > the entire <script> block (bug?).
>
> Yes, that's a bug, I'll look into it. This really should have the
> inside encloser <script>.

Silly, silly bug *sigh*. Patch attached.

Cheers,
--
Bob

Bob McElrath (rsm...@st...)
Univ. of Wisconsin at Madison, Department of Physics
From: Bob M. <mce...@us...> - 2001-08-03 20:09:54
|
John F Waymouth [way...@wp...] wrote:
> > Heh, usually when there's an ad-related article there, I post some
> > blatant self promotion, but I didn't this time because I haven't written
> > an anti-popup rule yet. ;) I prolly should.
>
> Ok, so, I was bored, and wrote a very broad rule for popups, which not
> only zaps 'em, but also tries to keep scripts using the window.open
> function working:
>
>     rewrite regex /window\.open\(/ as eval("null //" + 

Clever, clever.

> (note that space after the plus)
>
> This has the following advantages:
>
> * no need to try to find the whole window.open() call (matching parens,
>   ick)
> * Convinces javascripts that window.open returned null
> * Accepts whatever args they passed to window.open() seamlessly (the args
>   after the first aren't evalled, and anything in the first arg is
>   commented)
>
> I didn't add an "inside encloser <script>" because that seems to replace
> the entire <script> block (bug?).

Yes, that's a bug, I'll look into it. This really should have the inside
encloser <script>.

> Lemme know what you think.

Looks good. You sent this just as I was trying to find a site that has
popups. Do you know of any?

Cheers,
--
Bob

Bob McElrath (rsm...@st...)
Univ. of Wisconsin at Madison, Department of Physics
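
To see what the rule above actually does to a page, here is a small
standalone demonstration using an equivalent Perl substitution; the sample
URL is made up, and the real matching is of course done by the Rewrite
module rather than this one-off regex.

    # Standalone demo of the popup rule's effect (equivalent substitution,
    # not the Rewrite engine itself).  Note the trailing space after the +.
    my $js = q{w = window.open('http://ads.example.invalid/pop.html', 'ad', 'width=468');};
    $js =~ s/window\.open\(/eval("null \/\/" + /g;
    print "$js\n";
    # Prints:
    #   w = eval("null //" + 'http://ads.example.invalid/pop.html', 'ad', 'width=468');
    # The first eval argument concatenates to "null //http://..." -- everything
    # after // is a javascript comment, so the call evaluates to null, and the
    # page's script behaves as if the popup had simply been blocked.
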
From: John F W. <way...@wp...> - 2001-08-03 18:51:12
|
Ok, so, I read the slashdot article about popups, and decided I was once
and for all pissed off enough about ads to do something, so I searched
around for a decent HTML rewriting proxy for linux. Yours seems to be the
best one I've found, and, BONUS!, it's for PERL! So, I checked it out, and
must say I'm impressed by its out-of-the-box ability to block a ton of
stuff.

But I'm a website designer, and I can see a need to access the original
source of a document, including content that would otherwise have been
rewritten/removed. I hacked together Source.pm as a rudimentary way to view
the original source of a file. The basic gist of it is that URLs targeted
at the fake host "source" get snagged and their content-type is set to
text/plain (yes, I'm a cheapass). To get it working, bring in the module in
FilterProxy.pl, and add a filter rule for all URLs looking like this:

    http://source/?http://foobar.com/document

I allowed any amount of /'s after the source, including none, just in case.
It occurs to me that maybe I should use http://source/URL to fully fall
within the guidelines for a correct URI (avoid having two ?'s for GET
documents). Ah well, another version. Anyway, I point the filter rule at
Source, and also turn off every other filter (maybe I should leave
Headers). To make usage easier, I created a bookmark (in the Mozilla
Personal Toolbar) of javascript:window.open("http://source/?" +
document.location);, to allow easy source viewing.

For now, well, it works, sorta. I seem to get some HTTP cruft at the top
and bottom (like a 0 at the end?), and, in Mozilla, the connection doesn't
seem to want to close... ideas?

This is a major hack. I'm mostly sending it to you to get the idea out in
the open. I left the version as 0.01 on purpose ;) I'm thinking in the
future that I could actually return an HTML document, escaping all of the
HTML, and prettifying it with syntax highlighting using <font> tags for
color. That might also avoid the problems above. However, I'm not sure I'll
be able to do much more work on this in the near future, because I'm
dealing with RSI problems. I'm pleased to have gotten this to work thus
far :)

Anyway, I hope you like my meager contribution to your project, and if you
feel it's actually worth distributing, feel free to modify it and/or add it
to your project, it's GPL'd. And thanks for a very useful program :)
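
For the "return an HTML document with escaped, highlighted source" idea at
the end of this message, the escaping step needs no extra CPAN module at
all. A minimal sketch, purely illustrative and not part of Source.pm:

    # Sketch: escape the page's markup by hand and wrap it so the browser
    # displays source instead of rendering it.  A filter module would set
    # Content-Type: text/html and substitute this for the response body.
    sub source_as_html {
        my ($content) = @_;
        my %esc = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');
        $content =~ s/([&<>"])/$esc{$1}/g;
        return "<html><body><pre>\n$content\n</pre></body></html>";
    }
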
From: Bob M. <mce...@dr...> - 2001-07-27 16:25:52
|
Mail survey office [sor...@ec...] wrote:
> Urgent survey: support or oppose the Koizumi cabinet
>
> We apologize for the interruption, but please click the URL below and
> help us by taking the survey.
>
> http://211.9.37.210/koizumi/koizumi_an.asp?id=219308

English, please on this list.

Cheers,
--
Bob

Bob McElrath (rsm...@st...)
Univ. of Wisconsin at Madison, Department of Physics
From: <sor...@ec...> - 2001-07-27 16:09:15
|
[Translated from the original Japanese]

Urgent survey: support or oppose the Koizumi cabinet

We apologize for the interruption, but please click the URL below and help
us by taking the survey.

http://211.9.37.210/koizumi/koizumi_an.asp?id=219308
From: Bob M. <mce...@dr...> - 2001-06-08 15:16:01
|
Colin J. Wynne [cw...@av...] wrote:
> Hi Bob,
>
> I just installed FilterProxy (on the recommendation of my friend,
> Danek Duvall, actually), and after some teething problems making it
> all work, it's up and running.

Suggestions on how to make the install easier would be appreciated. ;)

> I have noticed one minor problem, though. When I run an Internet
> Explorer browser through the proxy, I often find that the page loading
> indicators hang just before completion, even though a View Source
> shows me the whole source has come through and there is no indication
> of unfinished images on the visible page. On the other hand, when I
> run lynx, the page loads right up, all nice and filtered, with no
> hesitations.

This has been a long-standing problem with Mozilla, and is a bug with
Mozilla. IE is totally untested (I don't use any M$ products...), but I'd
be interested if this is an FP bug.

Personally, I've tested Netscape, Mozilla, Galeon, Konqueror, and
StarOffice. The only one with the page-hang problem is Mozilla, and there
are bugs filed against Mozilla for it. (Note I think this is fixed in
recent Mozilla nightlies.)

Ok, so which version of IE are you using? Which version of FP? (latest is
0.29.1)

> One such page has been spinning for the last five minutes, as I watch.
> A strace of the corresponding FilterProxy process shows that it is
> stuck on
>
>     select(8, [6], NULL, NULL, {1558, 750000}

That basically means it's waiting for data...nothing unusual there.
FilterProxy thinks it's done.

> For the record, I am using a custom config. I whacked all the rules
> out of the default config, and have put in just a handful of Rewrite
> rules which go after <img>, <src>, <script>, and <iframe> ads.

Shouldn't affect the unfinished-loading thing. If you write some good
rules, please consider forwarding them to me or fil...@li... and I'll
include them in later releases. ;) Someday I'll add a 'variables' facility
to add things like bad-hosts, to make the .* ADS rule less hideous. (The
thing is that one complicated rule is a lot faster than several simple
rules...)

Ok, so here's the lowdown: FP tries to keep as many connections open to
the browser as possible, to speed things up. It does this by using the
HTTP/1.1 Keep-Alive header. For Netscape it uses the Netscape-specific
Proxy-Connection header. FilterProxy will, however, close the connection
if the browser tells it to do so (via the "Connection: close" header). The
most likely cause of the unfinished-loading bug is a connection that is
left open, that IE thinks should be closed, or improper headers sent by FP
(assuming it's not an IE bug ;).

Questions:

1) Is IE an HTTP/1.1 client, or is it pretending to be Netscape?
   (HTTP/1.0 + Proxy-Connection) Look in FilterProxy.log for:

       HTTP/1.0 proxy request for ...

2) Which connections are still open when IE hangs (ones for gifs? html?)
   -- look in the log file and correlate pids. The log looks like:

       [<PID> <date>] Message...

   In Netscape/Mozilla it's useful to hit 'Esc' in the browser (or hit
   stop), which forces it to stop loading the page and close all
   connections. You can then look in the log to see which connections get
   closed, and scroll up in the log to find the last URL loaded by that
   pid. Also useful is to 'kill -USR1 <filterproxy's pid>', which will
   cause FP to dump a list of open connections to the log.

3) Is Compress turned on?

4) Is IE sending the TE: header? Turn on debug logging on the FP config
   page, and also turn on header-dumping (hit 'Header' for the url '.*').

Please send me relevant portions of your FilterProxy.log.

> **********************************************************************
>     /\      Colin J. Wynne
>    (())     www.avtokrator.org/~cwynne/       ``Lunatic-at-Large''
>   /____\    cw...@av...
>  /______\
> /________\  ``[O]nce you have done away with the ability to make
>             judgments as to right and wrong [...] there's no real
>             culture left. All that remains is clog dancing and
>             macrame.'' ---Neal Stephenson
> **********************************************************************

Stephenson fan? I'm almost done with Snow Crash. He's quite a talented
author.

Cheers,
--
Bob

Bob McElrath (rsm...@st...)
Univ. of Wisconsin at Madison, Department of Physics
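
The keep-alive behavior described above (HTTP/1.1 Keep-Alive, the
Netscape-only Proxy-Connection header, and honoring "Connection: close")
boils down to a small decision over the request headers. Here is a sketch
of that decision using an HTTP::Headers object from libwww-perl; it is not
the actual logic in FilterProxy.pl, and the function name is invented.

    # Sketch only: decide whether to keep the client connection open.
    use strict;
    use warnings;
    use HTTP::Headers;

    sub keep_client_alive {
        my ($protocol, $headers) = @_;   # e.g. "HTTP/1.1", an HTTP::Headers object
        my $conn = lc( $headers->header('Connection')
                    || $headers->header('Proxy-Connection')   # Netscape-ism
                    || '' );
        return 0 if $conn =~ /\bclose\b/;         # browser asked us to close
        return 1 if $conn =~ /\bkeep-alive\b/;    # explicit keep-alive
        return $protocol ge 'HTTP/1.1' ? 1 : 0;   # HTTP/1.1 defaults to persistent
    }
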
From: Bob M. <rsm...@st...> - 2001-05-08 13:40:14
|
FilterProxy 0.29.1 has just been released, with the following changes:

* Fixed Rewrite bug when using 'attrib' and 'rewrite'.
* Fixed Rewrite bug when using 'attrib', and attrib has no value.
* Fixed Rewrite bug when using 'inside' predicate.
* Minor error message clarification.

As I promised earlier, I'll release 0.30 when Parse::ePerl has been
replaced (not sure when I'll get to that), but I thought I should get
these bug fixes out too.

P.S. Is anyone on these lists besides me?

Cheers,
--
Bob

Bob McElrath (rsm...@st...)
Univ. of Wisconsin at Madison, Department of Physics
From: Bob M. <mce...@dr...> - 2001-04-13 13:57:24
|
Ultimate Hacker [ha...@va...] wrote:
> Hi, I'd just like to say your program looks very nice. I'm anxious to get
> it working. :-)
>
> That said, I'm having some difficulty. I've noticed you mention eperl has
> had some problems on Perl 5.6. Maybe that's what's happening here.
>
> I'm trying to configure it on a linux box at work to act as a proxy for
> myself going through a SOCKS firewall. I've set up the firewall in
> FilterProxy to hit http://www.ourfirewall.com:80. I keep seeing the
> following errors in the log when I attempt to make a test connection to
> http://cnn.com.
>
> [15829 Fri Apr 13 08:31:23 2001] [Perl WARNING] Use of uninitialized value in pattern match (m//) at /usr/lib/perl5/site_perl/LWP/Protocol.pm line 108.
> [15829 Fri Apr 13 08:31:23 2001] [Perl WARNING] Use of uninitialized value in concatenation (.) at /usr/lib/perl5/site_perl/LWP/Protocol.pm line 83.

I can see how line 108 could fail if $scheme were empty, but line 83
doesn't have the (.) operator in it... (maybe because I have 5.50).

> I'm using eperl-2.2.14 and libwww-perl-5.53. It seems the lines which are
> complaining are not happy because $scheme is coming in blank or undefined.
> Do you have any idea what I might do to fix this?

Do this:

Run 'FilterProxy -n' and capture the output (it will give a stack dump
after the warnings). Turn on "dump headers to log file" in the Header
module. Turn on Debug and Timing.

Send me as much info as you can from doing the above.

Thanks,
--
Bob

Bob McElrath (rsm...@st...)
Univ. of Wisconsin at Madison, Department of Physics
From: Bob M. <mce...@dr...> - 2001-03-22 06:44:55
|
Thue [th...@di...] wrote:
> I am trying to write a rewrite rule, but having trouble with the
> replacement text. This is the rule (it is a "rewrite" as opposed to
> "strip"):
>
>     regex /MarkG/ as lslsls
>
> It does remove MarkG, but the "lslsls" is not inserted in its place. If I
> delimit "lslsls" with e.g. <> or "", i.e.
>
>     regex /MarkG/ as <lslsls>
>
> it works fine.

Sorry about that, I fixed it and a patch is attached (against
FilterProxy/Rewrite.pm). I'll release 0.30 as soon as I can, which will
have this fix (ack so busy...).

--
Bob

Bob McElrath (rsm...@st...)
Univ. of Wisconsin at Madison, Department of Physics
From: Bob M. <mce...@dr...> - 2001-03-22 06:43:21
|
John Stamp [st...@ci...] wrote:
> I think that I may have found a problem:
>
> I have FilterProxy set to load on startup. I dial into my ISP, then try to
> connect with Konqueror. Whatever site I point to returns "500 Can't
> connect to..."
>
> The browser will only connect to sites if I restart FilterProxy after a
> connection is already up. If I do that, I can disconnect and reconnect as
> much as I want without any further need to restart the proxy.
>
> I checked one other thing. If I restart FilterProxy after the connection
> goes down, then the error crops up again.
>
> I'm running the latest package of filterproxy from Debian unstable, and
> using it on Woody.
>
> I hope this helps. If you need any further information, please let me
> know. If I'm doing something absolutely stupid, I hope you'll let me know
> that too.

Do you have FilterProxy set up to bind to 'localhost' or some other IP?
($HOSTNAME parameter in FilterProxy.pl)

My guess is that FilterProxy starts up and tries to bind to your IP
address. When you're not dialed into your ISP, it binds to 'localhost'.
Then you dial into your ISP and point your browser at something other than
'localhost'. Meanwhile, FilterProxy is listening on 'localhost', and no
connections come.

I think the default under linux is that if your hostname is 'blah' and
some network program tries to bind to IP 'blah' when no network is active
(only localhost), it will end up binding to 'localhost'. This should be
true for any program which accesses the network.

Note that you can explicitly tell FilterProxy to use 'localhost' and tell
your browser to use the proxy at http://localhost:8888, and this should
work across dial-ins, since 'localhost' exists whether you're dialed in or
not.

If you're already using localhost, let me know...and I'll come up with
another hare-brained theory. ;}

--
Bob

Bob McElrath (rsm...@st...)
Univ. of Wisconsin at Madison, Department of Physics
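
Bob's advice (bind explicitly to 'localhost' so the proxy behaves the same
whether or not the dial-up link is up) corresponds to an explicit LocalAddr
when the listening socket is created. A hedged sketch with
IO::Socket::INET follows; it mirrors the $HOSTNAME parameter he mentions
but is not the actual FilterProxy.pl startup code.

    # Illustrative sketch: bind the proxy's listening socket to the loopback
    # address so it keeps accepting connections across dial-up sessions.
    use strict;
    use warnings;
    use IO::Socket::INET;

    my $HOSTNAME = 'localhost';     # the parameter referred to above
    my $listener = IO::Socket::INET->new(
        LocalAddr => $HOSTNAME,     # 127.0.0.1 exists with or without a network
        LocalPort => 8888,
        Proto     => 'tcp',
        Listen    => 5,
        Reuse     => 1,
    ) or die "Cannot bind to $HOSTNAME:8888: $!\n";
    # Point the browser at http://localhost:8888 and the proxy will be
    # reachable whether the PPP link is up or down.
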