filterproxy-devel Mailing List for FilterProxy (Page 2)
Message counts by month:

2001: Mar (2), Apr (1), May (1), Jun (2), Jul (2), Aug (19), Sep (1), Oct (5), Nov (2)
2002: Jan (9), Mar (3), Apr (5), May (15), Jun (1), Jul (4), Aug (3), Sep (1), Oct (1)
2003: Feb (1), Apr (1), Oct (2)
2006: Jan (1), Feb (1), Mar (3), Oct (2), Nov (1)
2007: Feb (1), Mar (1), Apr (1), May (1)
2008: Feb (1)
2009: Dec (1)
From: Bob M. <mce...@dr...> - 2002-05-28 16:02:19
TomLynema [tl...@ka...] wrote:
> Hi
> I am running a squid proxy here at Kalamazoo Christian High School and
> was wondering if there was a way to seamlessly tie FilterProxy in with
> squid.

Sure, just tell FilterProxy to use the squid proxy as its upstream proxy.
You can do this on the main FilterProxy config page, or by:

    # setenv http_proxy http://your.isp.here:1234    (csh syntax)
    # http_proxy=http://your.isp.here:1234           (sh syntax)

This is in the README. Please be sure to read the WARNING at the bottom of
the README file before deploying FilterProxy for use by others.

Cheers,
-- Bob

Bob McElrath (rsm...@st...)
Univ. of Wisconsin at Madison, Department of Physics
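For anyone wiring this up in code rather than the shell, a minimal sketch of how an LWP-based proxy such as FilterProxy can pick up an upstream proxy; the host and port are placeholders, and this is an illustration, not FilterProxy's own code:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use LWP::UserAgent;

    my $ua = LWP::UserAgent->new;

    # env_proxy() reads http_proxy (and friends) from the environment,
    # which is how FilterProxy learns about an upstream squid.
    $ua->env_proxy();

    # Equivalent explicit form, bypassing the environment:
    # $ua->proxy('http', 'http://your.isp.here:1234');

    my $res = $ua->get('http://www.example.com/');
    print $res->status_line, "\n";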
From: Yann D. <yd...@fr...> - 2002-05-14 11:04:01
Hi,

I'm looking at this problem while Guillaume is away.

On Tue, May 07, 2002 at 11:09:59AM -0500, Bob McElrath wrote:
> > > 2) $daemon->accept is returning false or undef (line 295).
> > > IO::Socket (parent of HTTP::Daemon) is supposed to return undef upon
> > > 'failure'. The only failure I can see is if the port it is binding
> > > to is closed, or timeout, and clearly the latter is not happening.
> > > Under some circumstances (timeout) IO::Socket::accept returns an
> > > error code in $@. Try printing out $@ in the "Exiting outside main
> > > loop (BUG!)" message.
> >
> > Yes, but accept(2) can fail for other reasons that you should test for,
> > such as ECONNRESET. After the getpeername error, I get EPIPE; I tried to
> > launch accept again, and it seems to trigger endless EPIPE errors.
>
> This should not be possible.
>
> The secondary connections (the HTTP::Daemon::ClientConn object returned
> by accept()) should be resettable. But the listen socket should not be
> resettable.

A google search on "accept ECONNRESET" told me that BIND 9 had handled a
similar problem. A comment from the source (BIND 9.2.1) says:

    * Try to accept the new connection. If the accept fails with
    * EAGAIN or EINTR, simply poke the watcher to watch this socket
    * again. Also ignore ECONNRESET, which has been reported to
    * be spuriously returned on Linux 2.2.19 although it is not
    * a documented error for accept().

A quick search in the linux-kernel and linux-net archives did not give
anything useful.

About errors to be "ignored", the accept(2) manpage says:

    ERROR HANDLING
        Linux accept passes already-pending network errors on the new
        socket as an error code from accept. This behaviour differs from
        other BSD socket implementations. For reliable operation the
        application should detect the network errors defined for the
        protocol after accept and treat them like EAGAIN by retrying. In
        case of TCP/IP these are ENETDOWN, EPROTO, ENOPROTOOPT, EHOSTDOWN,
        ENONET, EHOSTUNREACH, EOPNOTSUPP, and ENETUNREACH.

I suppose that perl's accept() call takes care of this (or should) for
portability. The EPIPE looks even more abnormal, and might be a result of
perl's accept() *not* handling this poorly-documented ECONNRESET case.
It's probably useful to seek confirmation of this on the perl lists.

-- Yann Dirson <Yan...@fr...>             http://www.alcove.com/
   Technical support manager
   Senior Free-Software Consultant
   Debian developer (di...@de...)
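Pulling the thread's suggestions together, a minimal sketch (not FilterProxy's actual main loop) of an accept loop that retries on the transient errors named above, checking $! against the POSIX errno constants; the port is a placeholder:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use HTTP::Daemon;
    use POSIX qw(:errno_h);

    my $daemon = HTTP::Daemon->new(LocalPort => 8888, Reuse => 1, Listen => 40)
        or die "HTTP::Daemon failed to initialize: $!\n";

    while (1) {
        my $client = $daemon->accept;
        unless (defined $client) {
            # Retry the errors accept(2) can return spuriously on Linux;
            # anything else really is fatal.
            next if $! == ECONNRESET || $! == EAGAIN || $! == EINTR;
            die "accept failed: $!\n";
        }
        # ... hand $client off to a forked child here, then:
        $client->close;
    }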
From: Bob M. <mce...@dr...> - 2002-05-09 21:07:38
John Straw [joh...@be...] wrote:
> Bob McElrath writes:
> > Could you try commenting out the line:
> >     $agent->env_proxy();   # process http_proxy environment variable
> > in FilterProxy.pl and see if it fixes your POST problem? From my
> > reading of LWP::UserAgent, it won't, but it should. *then* we should
> > file a bug. ;) This is an easy fix though, maybe I'll submit a patch
> > against LWP.
>
> I tried the following:
>
> 1. reset the http_proxy variable in the FilterProxy.conf file to point
>    at my upstream proxy, _without_ the username:password hack.
>    http_proxy_user and http_proxy_password are still set in the file.
>
> 2. comment out the line in FilterProxy.pl
>
> 3. restarted FilterProxy.
>
> No change in behavior. The only thing that works for me (that I know
> about) is the username:password hack, which gets around the fact that
> LWP chokes on the 407 return when performing a POST.
>
> (I'm confused about what effect you expect this change to have,
> though, because when FilterProxy has no http_proxy environment
> variable set when I start it (now))

LWP doesn't actually call the get_basic_credentials method when it
encounters a proxy-authorization challenge. This is basically the bug.
However, if the proxy info is set via the http_proxy environment variable,
it is able to get that info correctly.

It kinda makes sense that it should work this way, though. Otherwise how
would you access a website that needed authorization, through a proxy that
needed a *different* authorization? (get_basic_credentials is what is used
to get the user/pass for regular 'ol non-proxy HTTP basic authorization)
I can modify FilterProxy to do this automatically, and work with the data
on the config page.

So based on all this I have added to my TODO file:

* Check http_proxy environment var and ignore it if it points to me.
* Print out what upstream proxy is being used on startup.
* Prefer config-page-entered upstream proxy/user/pass to environment
  variable http_proxy if they differ.
* When upstream proxy requires authentication, construct:
      http_proxy = http://user:pass@host:port/
  and pass this to $agent->proxy() rather than using get_basic_credentials.
  (HTTP authorization challenges should be passed back to browser)
* Investigate HTTP/1.1 and connection cache in LWP/Protocol/http.pm --
  no longer remove keep-alive and other HTTP/1.1 headers?

> > LWP man page:
> >     PERL_LWP_USE_HTTP_10
> > I guess it is using HTTP/1.1 now...that's new. ;) I shall have to
> > investigate further, this could speed up FilterProxy a bit. Oooohhh...
> > it looks like it now can keep a connection cache and re-use connections
> > to upstream servers! ;)
>
> I thought I had seen some stuff about 1.1 in there. I'll be glad if
> my fumbling leads to productive hacking! ;)

Prolly not for a while. I'm sooooo busy these days... If you feel like
hacking some perl, go ahead; I prolly won't get to this for a while.

Cheers,
-- Bob

Bob McElrath (rsm...@st...)
Univ. of Wisconsin at Madison, Department of Physics
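A minimal sketch of the TODO item above -- embedding the credentials in the proxy URL so LWP sends Proxy-Authorization preemptively instead of relying on get_basic_credentials. The user, password, host, and port are placeholders:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use LWP::UserAgent;
    use URI::Escape qw(uri_escape);

    my ($user, $pass) = ('proxyuser', 'secret');        # placeholders
    my ($host, $port) = ('proxy.example.com', 3128);    # placeholders

    my $ua = LWP::UserAgent->new;

    # user:pass embedded in the proxy URL becomes a Proxy-Authorization
    # header on each request, so no 407 round-trip is needed for POSTs.
    $ua->proxy('http', sprintf('http://%s:%s@%s:%d/',
               uri_escape($user), uri_escape($pass), $host, $port));

    my $res = $ua->post('http://www.example.com/form', { q => 'test' });
    print $res->status_line, "\n";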
From: Bob M. <mce...@dr...> - 2002-05-09 19:47:47
John Straw [joh...@be...] wrote:
> Thanks for the quick response. I've done some more digging, and
> here's what I've found, so far.
>
> Taking the problems in reverse order:
>
> Bob McElrath writes:
> > John Straw [joh...@be...] wrote:
> > > 2. Not as big a deal, but I can only start FilterProxy once per system
> > > boot. If I stop FilterProxy for any reason and try to restart it, it
[snip]
> So I consider this problem to be solved. However, I don't know what
> in my environment caused the fork bomb in the first place. Would you
> like me to dig through the system calls when I use my full environment
> to help figure out what happened?

Do you have the http_proxy environment variable set in the shell from
which you (re)start FilterProxy? FilterProxy obeys this variable, and
would use itself as the upstream proxy! This can (and should!) cause a
fork-bomb.

Perhaps this would be a case worth detecting...and refusing to use itself
as the upstream proxy.

Thanks for the digging on LWP too.

Cheers,
-- Bob

Bob McElrath (rsm...@st...)
Univ. of Wisconsin at Madison, Department of Physics
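A minimal sketch of the self-detection proposed above; it only compares the local hostname and listen port, so it would miss aliases and extra interfaces, and the port number is a placeholder for FilterProxy's $LISTEN_PORT:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use URI;
    use Sys::Hostname;

    my $listen_port = 8888;            # placeholder for $LISTEN_PORT
    my $hostname    = hostname();

    # Refuse to use ourselves as the upstream proxy: forwarding requests
    # back to this process is what produces the fork bomb.
    if (my $env = $ENV{http_proxy}) {
        my $u = URI->new($env);
        if ($u->can('port') && $u->port == $listen_port
            && ($u->host eq $hostname
                || $u->host =~ /^(localhost|127\.0\.0\.1)$/)) {
            warn "http_proxy points at myself ($env); ignoring it.\n";
            delete $ENV{http_proxy};
        }
    }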
From: Bob M. <mce...@dr...> - 2002-05-08 23:14:10
John Straw [joh...@be...] wrote:
> Hello,
>
> First, let me say that I really like FilterProxy. I had been using
> junkbuster for several years, but I was having trouble using it with
> mozilla. I switched to FilterProxy, and I haven't regretted it. It's
> much more powerful, and configuration is much easier with the web
> interface.

Glad you like it!

> That said, I have a couple of problems, and I'm wondering if you can
> give me any help.
>
> Some information about my setup:
>
> * FilterProxy 0.30
> * sparc Solaris 8
> * perl 5.6.1
> * libwww-perl 5.64
> * HTML::Mason 1.04
> * Compress::Zlib 1.16
> * haven't installed XML, XSLT, or ImageMagick modules
>
> 1. I've recently been forced to start using an upstream proxy which
> requires authentication -- previously we had an open proxy, which I
> used with no problems. When I try to submit a web form using the GET
> action, everything is okay. If the form uses POST, however, I
> immediately get an error: "500 Can't read entity body: Connection
> reset by peer".
>
> This happens with all the browsers I've tried running against
> FilterProxy: mozilla and Netscape 4.78 on my solaris box, IE 5.5 on my
> WinNT box. The settings for http_proxy_username and
> http_proxy_password appear to make no difference -- the problem occurs
> if I leave them unset, as well.

This is not a configuration I have tested at all. It might be useful to
use ngrep and/or tcpdump on the machine running FilterProxy to determine
if it is FilterProxy or the upstream proxy that is generating the error.

After filtering, FilterProxy hands off the request to LWP, which handles
talking to your upstream proxy. To determine if it is an LWP or upstream
proxy bug, use the lwp-request program to see if you can get a POST
request through your upstream proxy. If POSTs work using only the
upstream proxy and your browser, but not lwp-request, then the problem
must be in LWP...

> 2. Not as big a deal, but I can only start FilterProxy once per system
> boot. If I stop FilterProxy for any reason and try to restart it, it
> goes crazy and forks hundreds of perl processes, sucking up all the
> resources on my machine. I have to reboot before I can run it again.

That is a big deal!

> I've seen a reference to a "infinite fork bomb" in the TODO file -- is
> this the same problem? Similar? This is absolutely reproducible for
> me, and I don't understand it at all.

This one?

    Connect to proxy as a web server (not proxied)...causes infinite fork
    bomb. See John Waymouth's messages. Is this fixed? I can't reproduce
    it anymore...

I just tried it and everything seems to work fine when I connect to the
proxy as a regular web server. But that TODO note doesn't sound like your
problem. I have never actually seen one of these mythical fork-bomb bugs.

Could you:

1) Exit/kill FilterProxy and make sure there are no FP processes running,
   by checking manually with ps. The being-able-to-run-it-the-first-time
   behavior makes me think that it is still running...
2) Run an strace when you start FilterProxy again. Use a command like
   this (csh syntax):
       strace ./FilterProxy.pl -n >& FilterProxy.strace.log
   The -n will keep it attached to the terminal it started from, and you
   should be able to kill it with ctrl-c.
3) Send me the strace.log, or put it up for ftp/http if it's big or
   something. If Solaris doesn't have strace, see if it has something
   similar...(under linux, strace dumps system calls to stdout with their
   args)

> I hope that you can give me some sort of insight into these two
> problems. I don't speak much perl, but I'll happily pitch in as much
> as I can if you think it will help to clear things up.

Thanks. ;)

Cheers,
-- Bob

Bob McElrath (rsm...@st...)
Univ. of Wisconsin at Madison, Department of Physics
From: Guillaume M. <gui...@mo...> - 2002-05-08 09:32:54
Hi Bob,

In a message of 07 May at 13:31, Bob McElrath wrote:
> Well, if accept() is returning undef (which is what causes the main loop
> to exit), then it's the listen socket that has failed. I believe the
> getpeername() that is failing is not the one in FilterProxy.pl, but a
> getpeername called inside the accept() method somewhere (which is buried
> in perl XS). This then causes accept() to return undef, which causes
> the main loop to exit. The getpeername() in FilterProxy.pl cannot cause
> the main loop to exit.

Nope, according to the backtraces that FilterProxy prints, the
getpeername calls are the ones in FilterProxy.pl. Even if I kill the
child when getpeername returns undef, the following accept call will
return EPIPE. I do not know why. If you ignore it and just restart
accept, at some point accept will always return EPIPE.

> BTW I didn't see any getpeername syscalls in the strace you sent me. It
> just looked like accept returned ECONNRESET.

Yes, indeed it happened sometimes. This seems to have been fixed when I
upgraded to a more recent libwww-perl.

> Ok, I'll add this. But I wish it fixed your problem.

Same here :-)

> Then it seems the logical solution would be to close the Listen socket
> and start over (which you tried unsuccessfully). Perhaps close() does
> not relinquish the listening socket. Try $daemon->shutdown()? From the
> perlfunc man page:
>
>     This is useful with sockets when you want to tell the other side
>     you're done writing but not done reading, or vice versa. It's also a
>     more insistent form of close because it also disables the file
>     descriptor in any forked copies in other processes.
>
> Note that for this to work consistently, FilterProxy should kill any
> forked children. The shutdown() will probably forcibly close all
> sockets to all clients.

Hmm, I missed that method in the Perl manual. I'll try that.

I've basically implemented this in a shell script: I check if there is
still a listen socket, and if not I kill all children with a killall and
restart FilterProxy.

Thanks again for your help.

Regards,
-- Guillaume Morin <gui...@mo...>

<Overfiend> canard: thanks
From: Bob M. <mce...@dr...> - 2002-05-07 18:32:03
Guillaume Morin [gui...@mo...] wrote:
> In a message of 07 May at 11:09, Bob McElrath wrote:
> > > It is Linux 2.2.14 (tried too with .19) and glibc 2.1.3. Getpeername can
> > > return undef, if the syscall fails (here with ENOTCONN). If the peer has
> > > shutdown the connection, it looks logical to me.
> >
> > The peer cannot shut down the connection. This is a "listen" socket,
> > and should stay open when there are no connections.
>
> I do not see how the listen socket is concerned. You call getpeername on
> connected sockets, which are not listening anymore. Shutting down the
> connection would trigger ENOTCONN; I do not see what is surprising.

Well, if accept() is returning undef (which is what causes the main loop
to exit), then it's the listen socket that has failed. I believe the
getpeername() that is failing is not the one in FilterProxy.pl, but a
getpeername called inside the accept() method somewhere (which is buried
in perl XS). This then causes accept() to return undef, which causes
the main loop to exit. The getpeername() in FilterProxy.pl cannot cause
the main loop to exit.

BTW I didn't see any getpeername syscalls in the strace you sent me. It
just looked like accept returned ECONNRESET.

> > > I do not know if you should ignore that one ... But there is surely a
> > > list of accept(2) errors that you want to ignore.
> >
> > Hmmm good point. I don't see how to get the linux error in perl. All
> > the perl man pages just say it returns undef.
> >
> > Do you know where to get this error code? Maybe try printing out '$!'?
>
> Yes, if you use $! and compare it to the constants defined in
> POSIX qw/:errno_h/. I think you should restart after at least
> ECONNRESET, EAGAIN, EINTR.

Ok, I'll add this. But I wish it fixed your problem.

> > The accept(2) man page says to retry when it gives odd errors. Maybe
> > rearrange that main loop:
> >     while(1) {
> >         my $client = $daemon->accept;
> >         next unless(defined $client);
> >         ...
> >     }
> >
> > Maybe throw a sleep() in there so it doesn't consume 100% of the cpu if
> > the network goes down (or something).
>
> I've basically done this. But after the first EPIPE error, all successive
> calls were EPIPE (I tried this for a minute or so).

Then it seems the logical solution would be to close the Listen socket
and start over (which you tried unsuccessfully). Perhaps close() does
not relinquish the listening socket. Try $daemon->shutdown()? From the
perlfunc man page:

    This is useful with sockets when you want to tell the other side
    you're done writing but not done reading, or vice versa. It's also a
    more insistent form of close because it also disables the file
    descriptor in any forked copies in other processes.

Note that for this to work consistently, FilterProxy should kill any
forked children. The shutdown() will probably forcibly close all
sockets to all clients.

Cheers,
-- Bob

Bob McElrath (rsm...@st...)
Univ. of Wisconsin at Madison, Department of Physics
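A minimal sketch of that recovery path -- shut the listener down hard, then rebuild it. This is an illustration, not FilterProxy's code; $HOSTNAME and $LISTEN_PORT stand in for FilterProxy's config variables, and whether this actually clears the EPIPE state is exactly what was being tested in this thread:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use HTTP::Daemon;
    use Carp qw(croak);

    my ($HOSTNAME, $LISTEN_PORT) = ('localhost', 8888);   # placeholders

    sub new_listener {
        return HTTP::Daemon->new(LocalAddr => $HOSTNAME,
                                 LocalPort => $LISTEN_PORT,
                                 Reuse     => 1,
                                 Listen    => 40)
            || croak "HTTP::Daemon failed to initialize: $!\n"
                   . "Is $HOSTNAME:$LISTEN_PORT correct?";
    }

    my $daemon = new_listener();

    sub reset_listener {
        # shutdown(2) stops reads and writes and, unlike close(), also
        # disables the descriptor in any forked children still holding it.
        $daemon->shutdown(2);
        $daemon->close;
        $daemon = new_listener();
    }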
From: Guillaume M. <gui...@mo...> - 2002-05-07 16:44:22
In a message of 07 May at 11:09, Bob McElrath wrote:
> > It is Linux 2.2.14 (tried too with .19) and glibc 2.1.3. Getpeername can
> > return undef, if the syscall fails (here with ENOTCONN). If the peer has
> > shutdown the connection, it looks logical to me.
>
> The peer cannot shut down the connection. This is a "listen" socket,
> and should stay open when there are no connections.

I do not see how the listen socket is concerned. You call getpeername on
connected sockets, which are not listening anymore. Shutting down the
connection would trigger ENOTCONN; I do not see what is surprising.

> The syscalls getpeername and accept are part of glibc.

I do not know what your point is. Sure, the glibc contains hooks for the
syscalls, but the syscall code is definitely in the kernel.

> > Yes, but accept(2) can fail for other reasons that you should test for,
> > such as ECONNRESET. After the getpeername error, I get EPIPE; I tried to
> > launch accept again, and it seems to trigger endless EPIPE errors.
>
> This should not be possible.

I thought so, but it just happens here :-(

> > I do not know if you should ignore that one ... But there is surely a
> > list of accept(2) errors that you want to ignore.
>
> Hmmm good point. I don't see how to get the linux error in perl. All
> the perl man pages just say it returns undef.
>
> Do you know where to get this error code? Maybe try printing out '$!'?

Yes, if you use $! and compare it to the constants defined in
POSIX qw/:errno_h/. I think you should restart after at least ECONNRESET,
EAGAIN, EINTR.

> The accept(2) man page says to retry when it gives odd errors. Maybe
> rearrange that main loop:
>     while(1) {
>         my $client = $daemon->accept;
>         next unless(defined $client);
>         ...
>     }
>
> Maybe throw a sleep() in there so it doesn't consume 100% of the cpu if
> the network goes down (or something).

I've basically done this. But after the first EPIPE error, all successive
calls were EPIPE (I tried this for a minute or so).

-- Guillaume Morin <gui...@mo...>

Why criticize what you don't understand? (Sepultura)
From: Bob M. <mce...@dr...> - 2002-05-07 16:10:49
Guillaume Morin [gui...@mo...] wrote:
> In a message of 07 May at 9:48, Bob McElrath wrote:
> > Well I'm stumped. And bugs I can't reproduce are very hard to debug. :(
>
> I know; even though I can reproduce it, I can't fix it :-)
>
> > But here are a few suggestions:
> > 1) If it's dying because getpeername is returning undef, why is
> >    getpeername returning undef? getpeername is a system call
> >    (/usr/include/sys/socket.h). What operating system/libc are you
> >    using?
>
> It is Linux 2.2.14 (tried too with .19) and glibc 2.1.3. Getpeername can
> return undef, if the syscall fails (here with ENOTCONN). If the peer has
> shutdown the connection, it looks logical to me.

The peer cannot shut down the connection. This is a "listen" socket,
and should stay open when there are no connections.

The syscalls getpeername and accept are part of glibc.

> > 2) $daemon->accept is returning false or undef (line 295).
> >    IO::Socket (parent of HTTP::Daemon) is supposed to return undef upon
> >    'failure'. The only failure I can see is if the port it is binding
> >    to is closed, or timeout, and clearly the latter is not happening.
> >    Under some circumstances (timeout) IO::Socket::accept returns an
> >    error code in $@. Try printing out $@ in the "Exiting outside main
> >    loop (BUG!)" message.
>
> Yes, but accept(2) can fail for other reasons that you should test for,
> such as ECONNRESET. After the getpeername error, I get EPIPE; I tried to
> launch accept again, and it seems to trigger endless EPIPE errors.

This should not be possible.

The secondary connections (the HTTP::Daemon::ClientConn object returned
by accept()) should be resettable. But the listen socket should not be
resettable.

> I do not know if you should ignore that one ... But there is surely a
> list of accept(2) errors that you want to ignore.

Hmmm good point. I don't see how to get the linux error in perl. All
the perl man pages just say it returns undef.

Do you know where to get this error code? Maybe try printing out '$!'?

> > 3) It should be possible to work around this bug by re-initializing
> >    the HTTP::Daemon object. (But I'd rather fix it! ;) Add a big
> >    while loop around the main loop and do this at the end of it:
> >        $daemon = undef;
> >        $daemon = new HTTP::Daemon LocalAddr => $HOSTNAME,
> >                                   LocalPort => $LISTEN_PORT,
> >                                   Reuse => 1, Listen => 40
> >            or croak "HTTP::Daemon failed to initialize: $!\n"
> >                   . "Is $HOSTNAME:$LISTEN_PORT correct?";
> >    The latter is copied from around line 189.
>
> Yes, I'd prefer to fix it too. Unfortunately, this does not work since
> the port is already bound. I tried to do a $daemon->close before that,
> but it did not help.

The accept(2) man page says to retry when it gives odd errors. Maybe
rearrange that main loop:

    while(1) {
        my $client = $daemon->accept;
        next unless(defined $client);
        ...
    }

Maybe throw a sleep() in there so it doesn't consume 100% of the cpu if
the network goes down (or something).

Cheers,
-- Bob

Bob McElrath (rsm...@st...)
Univ. of Wisconsin at Madison, Department of Physics
From: Guillaume M. <gui...@mo...> - 2002-05-07 15:45:03
In a message of 07 May at 9:48, Bob McElrath wrote:
> Well I'm stumped. And bugs I can't reproduce are very hard to debug. :(

I know; even though I can reproduce it, I can't fix it :-)

> But here are a few suggestions:
> 1) If it's dying because getpeername is returning undef, why is
>    getpeername returning undef? getpeername is a system call
>    (/usr/include/sys/socket.h). What operating system/libc are you
>    using?

It is Linux 2.2.14 (tried too with .19) and glibc 2.1.3. Getpeername can
return undef, if the syscall fails (here with ENOTCONN). If the peer has
shutdown the connection, it looks logical to me.

> 2) $daemon->accept is returning false or undef (line 295).
>    IO::Socket (parent of HTTP::Daemon) is supposed to return undef upon
>    'failure'. The only failure I can see is if the port it is binding
>    to is closed, or timeout, and clearly the latter is not happening.
>    Under some circumstances (timeout) IO::Socket::accept returns an
>    error code in $@. Try printing out $@ in the "Exiting outside main
>    loop (BUG!)" message.

Yes, but accept(2) can fail for other reasons that you should test for,
such as ECONNRESET. After the getpeername error, I get EPIPE; I tried to
launch accept again, and it seems to trigger endless EPIPE errors.

I do not know if you should ignore that one ... But there is surely a
list of accept(2) errors that you want to ignore.

> 3) It should be possible to work around this bug by re-initializing
>    the HTTP::Daemon object. (But I'd rather fix it! ;) Add a big
>    while loop around the main loop and do this at the end of it:
>        $daemon = undef;
>        $daemon = new HTTP::Daemon LocalAddr => $HOSTNAME,
>                                   LocalPort => $LISTEN_PORT,
>                                   Reuse => 1, Listen => 40
>            or croak "HTTP::Daemon failed to initialize: $!\n"
>                   . "Is $HOSTNAME:$LISTEN_PORT correct?";
>    The latter is copied from around line 189.

Yes, I'd prefer to fix it too. Unfortunately, this does not work since
the port is already bound. I tried to do a $daemon->close before that,
but it did not help.

> 4) I just noticed the 'Listen => 40' parameter above. The man page
>    says:
>        Listen    Queue size for listen
>    Try increasing it?

Unfortunately, it does not change the behavior :-(

> 5) The man page also says that the 'Reuse' parameter is deprecated,
>    in favor of 'ReuseAddr' and 'ReusePort'. Try setting those two to
>    one instead, and drop the 'Reuse => 1' parameter?

I tried, but since Linux does not have the SO_REUSEPORT socket option,
there is no change :-(

Anyway, thanks a lot for your help.

Regards,
-- Guillaume Morin <gui...@mo...>

I am the saddest kid in grade number two (Lisa Simpson)
From: Bob M. <mce...@dr...> - 2002-05-07 14:48:32
Guillaume Morin [gui...@mo...] wrote:
> Hi Bob,
>
> I tried to upgrade perl to 5.6.1. It did not change the problem.
> getpeername still returns undef in some cases and this triggers an EPIPE
> in accept which kills the main loop.
>
> If you have any other ideas, please tell me.

Well I'm stumped. And bugs I can't reproduce are very hard to debug. :(

But here are a few suggestions:

1) If it's dying because getpeername is returning undef, why is
   getpeername returning undef? getpeername is a system call
   (/usr/include/sys/socket.h). What operating system/libc are you
   using?

2) $daemon->accept is returning false or undef (line 295).
   IO::Socket (parent of HTTP::Daemon) is supposed to return undef upon
   'failure'. The only failure I can see is if the port it is binding
   to is closed, or timeout, and clearly the latter is not happening.
   Under some circumstances (timeout) IO::Socket::accept returns an
   error code in $@. Try printing out $@ in the "Exiting outside main
   loop (BUG!)" message.

3) It should be possible to work around this bug by re-initializing
   the HTTP::Daemon object. (But I'd rather fix it! ;) Add a big
   while loop around the main loop and do this at the end of it:

       $daemon = undef;
       $daemon = new HTTP::Daemon LocalAddr => $HOSTNAME,
                                  LocalPort => $LISTEN_PORT,
                                  Reuse => 1, Listen => 40
           or croak "HTTP::Daemon failed to initialize: $!\n"
                  . "Is $HOSTNAME:$LISTEN_PORT correct?";

   The latter is copied from around line 189.

4) I just noticed the 'Listen => 40' parameter above. The man page
   says:
       Listen    Queue size for listen
   Try increasing it?

5) The man page also says that the 'Reuse' parameter is deprecated,
   in favor of 'ReuseAddr' and 'ReusePort'. Try setting those two to
   one instead, and drop the 'Reuse => 1' parameter?

Good Luck,
-- Bob

Bob McElrath (rsm...@st...)
Univ. of Wisconsin at Madison, Department of Physics
From: Guillaume M. <gui...@mo...> - 2002-05-07 10:59:41
Hi Bob,

I tried to upgrade perl to 5.6.1. It did not change the problem.
getpeername still returns undef in some cases, and this triggers an EPIPE
in accept which kills the main loop.

If you have any other ideas, please tell me.

-- Guillaume Morin <gui...@mo...>

People get the operating system they deserve.
From: Guillaume M. <gui...@mo...> - 2002-05-06 12:39:44
Hi,

I've tried to upgrade most related modules and to remove the hook on
$client->connected. It still crashes, but I can't see any errors in the
debug log. I only get the "[ERROR] Exiting outside main loop (BUG!)"
message, but no Perl errors. This is very weird.

The strace shows this:

    connect(8, {sin_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("127.0.0.1")}}, 16) = 0
    send(8, "\204>\1\0\0\1\0\0\0\0\0\0\00212\0010\00216\003172\7in-"..., 42, 0) = 42
    time(NULL) = 1020688418
    poll([{fd=8, events=POLLIN, revents=POLLIN}], 1, 5000) = 1
    recvfrom(8, "\204>\205\203\0\1\0\0\0\1\0\0\00212\0010\00216\003172\7"..., 1024, 0, {sin_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("127.0.0.1")}}, [16]) = 90
    close(8) = 0
    fork() = 16231
    rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
    rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
    close(7) = 0
    munmap(0x401ec000, 4096) = 0
    close(7) = -1 EBADF (Bad file descriptor)
    munmap(0x401eb000, 4096) = 0
    rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
    accept(6, 0xbffff714, [16]) = -1 ECONNRESET (Connection reset by peer)
    rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
    rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
    time([1020688418]) = 1020688418
    write(5, "[14919 Mon May 6 14:33:38 2002]"..., 74) = 74
    write(1, "[14919 Mon May 6 14:33:38 2002]"..., 74) = 74
    rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
    rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
    rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
    close(6) = 0
    munmap(0x401c7000, 4096) = 0
    close(6) = -1 EBADF (Bad file descriptor)
    munmap(0x401c6000, 4096) = 0
    rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
    rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
    rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
    rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
    rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
    close(4) = 0
    munmap(0x40194000, 4096) = 0
    rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
    munmap(0x401c8000, 4096) = 0
    munmap(0x401c5000, 4096) = 0
    _exit(0) = ?

The written text is the BUG message. As you can see, the problem seems
to be triggered by an unsuccessful call to accept.

Thanks for your help.

-- Guillaume Morin <gui...@mo...>

A friend in need is a friend indeed. A friend who bleeds is better.
My friend confessed, she passed the test. We will never sever. (Placebo)
From: Guillaume M. <gui...@mo...> - 2002-05-06 10:05:20
In a message of 02 May at 10:57, Bob McElrath wrote:
> This may seem unrelated...but the only thing I can think of is if you're
> exhausting the available sockets or something. Somehow the socket is
> getting closed forcibly. (maybe exhausting a per-user process or socket
> limit?) This seems reasonable if you reload a page with lots of images
> many times -- lots of FilterProxy activity.

I don't think so. I've done an strace of the main process. I saw no
signals or errors related to limits. The only error was getpeername
returning ENOTCONN, just as Perl reports.

> Try adding this somewhere inside data_handler in FilterProxy.pl:
>
>     if(!defined $client->connected()) {
>         die("Client aborted download");
>     }
>
> This should close outgoing connections when you hit "stop".

I tried this. It happens sometimes (I've added a logger call), but the
main process still dies.

> Do you hit reload repeatedly, and wait for the page to finish each time?
> Or do you hit reload without waiting for the page to finish? (the above
> patch might help in the latter case)

The crash happens when you load a page and click stop.

> Can you tell me how many FilterProxy processes there are just before it
> crashes?
>     ps aux | grep FilterProxy
> and network connections?
>     netstat -anp | grep FilterProxy

The results are very different. It varies from about 5 processes or
connections to 40.

> And what is the version of LWP and HTTP::Daemon that you have installed?

It is LWP 1.00 and libwww 5.43 (I tried to upgrade to LWP 5.47, but it
did not help).

> My only other suggestion is to upgrade your perl, as this could be
> caused by a perl bug (and let me know if that works...).

Unfortunately, I cannot do that. I must keep that version atm :-(.

Thanks for your help.

Regards,
-- Guillaume Morin <gui...@mo...>
From: Bob M. <mce...@dr...> - 2002-05-02 15:58:03
Well that's interesting... This should never happen, and is probably a
bug in HTTP::Daemon or IO::Socket. The $daemon->accept method should wait
forever.

This may seem unrelated...but the only thing I can think of is if you're
exhausting the available sockets or something. Somehow the socket is
getting closed forcibly. (maybe exhausting a per-user process or socket
limit?) This seems reasonable if you reload a page with lots of images
many times -- lots of FilterProxy activity.

Try adding this somewhere inside data_handler in FilterProxy.pl:

    if(!defined $client->connected()) {
        die("Client aborted download");
    }

This should close outgoing connections when you hit "stop". Do you hit
reload repeatedly, and wait for the page to finish each time? Or do you
hit reload without waiting for the page to finish? (the above patch might
help in the latter case)

Can you tell me how many FilterProxy processes there are just before it
crashes?
    ps aux | grep FilterProxy
and network connections?
    netstat -anp | grep FilterProxy

And what is the version of LWP and HTTP::Daemon that you have installed?

My only other suggestion is to upgrade your perl, as this could be caused
by a perl bug (and let me know if that works...).

Cheers,
-- Bob

Guillaume Morin [gui...@mo...] wrote:
> Hi Bob,
>
> First I'd like to thank you for FilterProxy, which is a pretty cool
> piece of software. I've found a bug in FilterProxy; here is a debug log:
>
> [10389 Thu May 2 15:55:44 2002] [ERROR] Exiting outside main loop (BUG!)
> [11027 Thu May 2 15:55:44 2002] [Perl WARNING] Use of uninitialized
>     value at /usr/lib/perl5/5.00503/i386-linux/Socket.pm line 295.
> [11027 Thu May 2 15:55:44 2002] Use of uninitialized value at
>     /usr/lib/perl5/5.00503/i386-linux/Socket.pm line 295.
>     FilterProxy::__ANON__('Use of uninitialized value at
>     /usr/lib/perl5/5.00503/i386-linux/...') called at
>     /usr/lib/perl5/5.00503/i386-linux/Socket.pm line 295
>     Socket::sockaddr_in(undef) called at ./FilterProxy.pl line 534
>     FilterProxy::handler('HTTP::Daemon::ClientConn=GLOB(0x85b56a0)')
>     called at ./FilterProxy.pl line 330
> [11027 Thu May 2 15:55:44 2002] [Perl ERROR] Bad arg length for
>     Socket::unpack_sockaddr_in, length is 0, should be 16 at
>     /usr/lib/perl5/5.00503/i386-linux/Socket.pm line 295.
> [11027 Thu May 2 15:55:44 2002] Bad arg length for
>     Socket::unpack_sockaddr_in, length is 0, should be 16 at
>     /usr/lib/perl5/5.00503/i386-linux/Socket.pm line 295.
>     FilterProxy::__ANON__('Bad arg length for
>     Socket::unpack_sockaddr_in, length is 0, shou...') called at
>     /usr/lib/perl5/5.00503/i386-linux/Socket.pm line 295
>     Socket::sockaddr_in(undef) called at ./FilterProxy.pl line 534
>     FilterProxy::handler('HTTP::Daemon::ClientConn=GLOB(0x85b56a0)')
>     called at ./FilterProxy.pl line 330
>
> (There are no other mentions of 11027 in the log; 10389 is of course the
> main process.)
>
> I think the problem in 11027 causes the BUG in the parent (reported
> first, but I guess it is a race condition).
>
> The problem can be triggered only with IE; you just have to load a page
> with images several times.
>
> The patch I suggest, untested unfortunately, is:
>
>     332:   my($peername) = getpeername($client);
>     +      next if (! $peername);
>
> I will be able to test it tomorrow. I'll report if it works.
> If you think this patch is wrong and/or you have another idea, please
> write to me.
>
> TIA. Regards,
>
> -- Guillaume Morin <gui...@mo...>
>
> Support the Debian Project (http://www.debian.org/)

-- Bob

Bob McElrath (rsm...@st...)
Univ. of Wisconsin at Madison, Department of Physics
From: Bob M. <mce...@dr...> - 2002-04-29 20:38:06
I noticed your bug report about FilterProxy:
> I just noticed that filterproxy doesn't actually seem to cope with
> aborted downloads, but instead continues to merrily download away.
>
> At this point the only way to stop the download is to actually restart
> filterproxy.
>
> This was tested on both Mozilla 1.0rc1 and Navigator 4.76, downloading
> the OpenOffice binaries from
> http://www.openoffice.org/dev_docs/source/build_641d/index.html#binaries,
> simply by initiating a download, and shortly thereafter cancelling it.

This has been in my TODO file for some time:

    Implement close of connection to server when client closes connection.

Patches would be happily accepted. ;) It looks like the way to do this is
to call die() inside sub data_handler() in FilterProxy.pl. The
LWP::UserAgent man page says:

    The request can be aborted by calling die() in the callback routine.
    The die message will be available as the "X-Died" special response
    header field.

Try adding this somewhere inside data_handler in FilterProxy.pl:

    if(!defined $client->connected()) {
        die("Client aborted download");
    }

And let me know if it works. I'll add it to the main distro.

BTW, for you Debian folks...I am now a Debian convert. ;) I converted my
running RedHat system to Debian, successfully. For the rest of you,
'apt-get install filterproxy' is the coolest thing in the world. :)

Cheers,
-- Bob

Bob McElrath (rsm...@st...)
Univ. of Wisconsin at Madison, Department of Physics
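A minimal sketch of where that snippet would live: LWP's content callback, with the abort surfacing in the X-Died header. The URL is a placeholder, and $client stands in for FilterProxy's HTTP::Daemon::ClientConn object for the browser connection:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use LWP::UserAgent;
    use HTTP::Request;

    my $ua = LWP::UserAgent->new;
    my $client;   # in FilterProxy, the browser's connection object

    my $res = $ua->request(
        HTTP::Request->new(GET => 'http://www.example.com/big-file'),
        sub {
            my ($chunk, $response) = @_;
            # Abort the upstream transfer as soon as the browser goes away.
            if (defined $client && !defined $client->connected()) {
                die "Client aborted download\n";
            }
            # ... otherwise hand $chunk on to the browser ...
        });

    my $died = $res->header('X-Died');
    print defined $died ? "aborted: $died\n" : "completed\n";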
From: Bob M. <mce...@dr...> - 2002-04-24 19:15:42
Did you edit FilterProxy.pl at all? (you shouldn't) The line it's
complaining about (FilterProxy.pl:33) should read:

    chomp($HOSTNAME = `hostname`);   # change this for multi-homed hosts.

You probably shouldn't change this line.

The ImageComp error is not a problem -- the proxy should run fine, but
you will be unable to use the ImageComp module.

Cheers,
-- Bob

Scott Foxman [sc...@ss...] wrote:
> Hi Bob,
>
> I have a problem starting FilterProxy, as follows:
>
> [filterproxy@ns-n filterproxy]$ ./FilterProxy.init start
> Starting FilterProxy: Password:
> Can't exec "ns-n.ss-6.com": No such file or directory at ./FilterProxy.pl line 33.
> Use of uninitialized value in scalar chomp at ./FilterProxy.pl line 33.
> Loaded module: Compress
> Loaded module: DeAnim
> Loaded module: Header
> Module ImageComp not loaded because:
> Can't locate Image/Magick.pm in @INC (@INC contains: . /usr/lib/perl5/5.6.0/i386-linux /usr/lib/perl5/5.6.0 /usr/lib/perl5/site_perl/5.6.0/i386-linux /usr/lib/perl5/site_perl/5.6.0 /usr/lib/perl5/site_perl /home/filterproxy/FilterProxy) at /home/filterproxy/FilterProxy/ImageComp.pm line 9.
> BEGIN failed--compilation aborted at /home/filterproxy/FilterProxy/ImageComp.pm line 9.
> Compilation failed in require at (eval 16) line 1.
> BEGIN failed--compilation aborted at (eval 16) line 1.
> Loaded module: Rewrite
> Loaded module: Skeleton
> Loaded module: Source
> Loaded module: XSLT
> Proxy on http://ns-n.ss-6.com:8888/ initialized
>
> [filterproxy@ns-n filterproxy]$ curl http://ns-n.ss-6.com:8888/FilterProxy.html
> <HTML>
> <HEAD><TITLE>An Error Occurred</TITLE></HEAD>
> <BODY>
> <H1>An Error Occurred</H1>
> 400 URL must be absolute
> </BODY>
> </HTML>
> [filterproxy@ns-n filterproxy]$
>
> Please advise me what to do!!
>
> Thank you,
> Hara

Bob McElrath (rsm...@st...)
Univ. of Wisconsin at Madison, Department of Physics
From: Bob M. <mce...@dr...> - 2002-04-24 19:13:02
Claudius Li [apr...@se...] wrote:
> Hi Bob,
> I'm thinking of using FilterProxy sort of like anonymizer.
> I work at the School of Visual Arts and we need a way to let students
> access certain web sites "from" school even though they are actually
> connecting from home.
> However, we can't tell a bunch of art students to change their proxy
> settings every time they want to access these sites, so I want to
> modify FilterProxy to be accessed via a CGI.
> Does this sound feasible to you? Any suggestions? Am I allowed to do
> this? I ask the last because I notice a GPL in FilterProxy.pl but the
> web site says that it and the software are under your copyright.

It sounds like what you want is a transparent proxy. For a transparent
proxy, your students would have to be "behind" the proxy (i.e. to get to
the outside world, their packets would have to pass through the proxy
machine). I think people have set up FilterProxy in this way before (via
port redirection using ipchains/iptables).

If your students are not "behind" the proxy, then the only thing you can
do is to have them put in proxy settings for netscape. (otherwise, their
web requests would not be routed to the computer the proxy is running on)

Cheers,
-- Bob

Bob McElrath (rsm...@st...)
Univ. of Wisconsin at Madison, Department of Physics
From: Bob M. <mce...@dr...> - 2002-04-14 15:38:04
The Sysop [kc...@al...] wrote:
> Hi Bob,
>
> I have a question. I wanted to build a proxy server like you did, but
> without these complicated libraries and functions you included.
> Therefore I looked at your source and thought to myself, you have much
> more than I need; I can cut and paste and then I have a start.
> Then I took a look at your source again, and said ... Wow, what a lot
> of stuff is inside there.
>
> Is it possible for you to help me out with a start of source that only
> listens to the 8888 port and where I can do my own filtering?
> Just a rough start without the xslt.pm lib?

Just remove the xslt.pm file from the FilterProxy/ directory and it won't
be loaded. FilterProxy will run fine.

Cheers,
-- Bob

Bob McElrath (rsm...@st...)
Univ. of Wisconsin at Madison, Department of Physics
From: Bob M. <mce...@dr...> - 2002-04-02 21:26:00
Michael Ralph Pape [ral...@we...] wrote:
> Hi,
>
> I discovered FilterProxy some days ago. A really great tool! Thanks a
> lot for it.

Glad you like it!

> After I installed it I played a while with it. There seems to be a bug
> in the http://freshmeat.net rule: the ad on the top is displayed. I've
> made NO changes to the rules in the config file.
>
> Would you please correct the rule? My understanding of complex regular
> expressions in perl is not too good.

Actually, it's not a rule problem. (I discovered this experimentally too.)
You must enable the Compress module for the freshmeat rule to work. This
is because Freshmeat (and Yahoo, and several other sites) send out
compressed content, and in order to do filtering, FilterProxy must first
decompress it; therefore the Compress module must be enabled. Do not
worry about speed...even when compressing content sent to your browser,
the Compress module is very fast compared to rewriting.

Let me know if this does not fix your problem. I will make a note about
this in the next release.

> Do you know urls where I could find other rules for filterproxy?

Unfortunately, no. Few people have sent me rules to date, and rules that
I have received I have incorporated into the main release. I think this
is mostly because writing a rule, and then copying it out of the web form
or FilterProxy.conf file, is a pain in the butt. In a future release I
hope to make rule management easier, and make sharing rules easier.

Cheers,
-- Bob

Bob McElrath (rsm...@st...)
Univ. of Wisconsin at Madison, Department of Physics
From: Bob M. <mce...@dr...> - 2002-03-25 04:43:24
Brian [bjm...@ui...] wrote:
> I've been playing around with FilterProxy for a few hours now, and I
> must say, it's a great package you have here. In fact, I only have one
> problem: I was tailing the log and I noticed that the "i" regex switch
> throws an error. It looks like they're case-insensitive anyway, so I
> just removed the offending i, but I never would have noticed if I
> wasn't paying attention to the logs.

Did you try to make a rule like this:

    tag </(frame|script)/i>

? That won't work...the code that grabs out the regex is pretty dumb. It
just looks for /(stuff)/ and doesn't try to grab any modifiers. The
modifiers applied to FilterProxy regexes by default are "isx". (i because
html is case insensitive, s because the whole file is matched against,
and x for obvious reasons -- see the ADS rule) Allowing modifiers
wouldn't be terribly useful... (which modifier would be useful?)

I'll make a note about modifiers in the Rewrite html file.

> Oh, one other little problem: I created a rewrite rule without a name,
> and now it doesn't show up in the menus and I can't delete it. I'm
> about to hack the config file by hand, wish me luck ;-)
>
> Thanks for the great code!

You're welcome! :)

Cheers,
-- Bob

Bob McElrath (rsm...@st...)
Univ. of Wisconsin at Madison, Department of Physics
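A minimal sketch (not FilterProxy's actual parser) of the behavior described above: the text between the slashes is captured without any trailing modifier, and isx is applied when the pattern is compiled:

    #!/usr/bin/perl
    use strict;
    use warnings;

    # In a rule like 'tag </(frame|script)/>', the regex lives between
    # the slashes; a trailing modifier such as 'i' is simply not captured.
    my $rule = 'tag </(frame|script)/>';
    if ($rule =~ m{/(.+)/}) {
        my $pattern = $1;
        # i: HTML is case-insensitive; s: the whole document is matched
        # as one string; x: whitespace/comments allowed in big rules.
        my $re = qr/$pattern/isx;
        my $html = '<FRAME src="ad.html">';
        print "rule matches\n" if $html =~ /<$re/;
    }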
From: Bob M. <mce...@dr...> - 2002-03-10 21:16:11
Adam Duck [du...@in...] wrote:
> >>>>> "Bob" == Bob McElrath <mce...@dr...ysics.wisc.edu> writes:
>
>     Bob> http://address:port should work.
>
>     Bob> $ua->env_proxy() for some examples. I'm not sure why my TODO
>     Bob> file says it's broken, but let me know if you get it to work!
>     Bob> ;)
>
> Oh yes, now it works like a charm. I thought I'd enter the proxy like
> in galeon, where I don't need the "http://" part. Thx for the advice.
> Perhaps you could mention it somewhere in README or INSTALL?

This is in the README, and I have just added a note on the config page
about how the upstream proxy URL should be entered.

> And: your TODO file didn't say it was broken. You misunderstood me
> here, I think.

No, it *does* say

    Using an upstream proxy is apparently broken... :(

in my TODO file. But that might have been added since 0.30...

>     Bob> BTW the uninitialized value ... Mason... ({}) you're seeing in
>     Bob> your logs is a Mason bug, but shouldn't affect anything.
>
> Yes, I know this now, because I read the TODO file ;-). I just did not
> know why FilterProxy stopped working after the second startup.
>
> Thx for this fast reply and this wonderful program. I hope that the
> next thing to be done is this "just parsing the file once" thingy ;-).

I don't have much time to work on it these days, but my personal priority
list looks like:

1) Fix outstanding bugs (new kinds of ads, mangled pages, etc)
2) Split config file into several files so that a) rules can be shared
   easily, and b) upgrades are more painless.
3) Everything else... ;)

Cheers,
-- Bob

Bob McElrath (rsm...@st...)
Univ. of Wisconsin at Madison, Department of Physics
From: Bob M. <mce...@dr...> - 2002-03-04 14:57:10
Tomasz Rzad [to...@rz...] wrote:
> On Fri, Feb 15, 2002 at 11:11:13AM -0600, Bob McElrath wrote:
> > Tomasz Rzad [to...@rz...] wrote:
> > > Bob,
> > >
> > > Is there any way to use FilterProxy in Transparent Proxy Mode? I
> > > always get "400 URL must be absolute".
> >
> > I think I have reports that people have set it up this way before. If
> > you're getting that error, FilterProxy thinks it's in server-mode.
> > Are you requesting a URL from filterproxy itself? (i.e.
> > http://host.here:8888/FilterProxy.html) Does it work if you just
> > request some other URL? (not a FilterProxy-served config page)
> >
> > Can you tell me how it is set up so far? This would be useful info to
> > add to the README file.
>
> Do you have any idea how to run the service I wrote you about before?
>
> Let me explain it to you again:
>
> I would like to redirect all requests to FilterProxy by doing
> "ipchains ... -j REDIRECT 8888" on my Linux box. Every time I do that
> I receive "URL must be absolute" (in my opinion this comes from
> UserAgent.pm). Can you help me with this matter?

I don't know how to do this. But I think I might know why it doesn't
work. Your browser thinks it is going to site www.abcd.com and sends a
request like this:

    GET /index.html

Your ipchains redirects this to FilterProxy, which sees the non-absolute
URL /index.html. FilterProxy can't figure out what site you were trying
to get because there isn't enough information in the request.

Can you turn on "dump headers to log file" on the Header config page, and
enable debug on the main page, and send me some of your logs?

Cheers,
-- Bob

Bob McElrath (rsm...@st...)
Univ. of Wisconsin at Madison, Department of Physics
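One detail the explanation above leaves implicit: an HTTP/1.1 request carries a Host header from which an absolute URL can be rebuilt, so transparent mode is possible in principle; only Host-less HTTP/1.0 requests truly lack the information. A sketch of the reconstruction (an illustration, not FilterProxy code):

    #!/usr/bin/perl
    use strict;
    use warnings;
    use HTTP::Request;

    # Rebuild an absolute URL for a request that arrived via REDIRECT,
    # where the request line only contains a path like /index.html.
    sub absolutize {
        my ($req) = @_;
        my $uri = $req->uri;
        return $uri if $uri =~ m{^https?://};    # already absolute
        my $host = $req->header('Host')
            or die "400 URL must be absolute (no Host header to recover it)\n";
        $req->uri("http://$host$uri");
        return $req->uri;
    }

    my $req = HTTP::Request->new(GET => '/index.html');
    $req->header(Host => 'www.abcd.com');
    print absolutize($req), "\n";    # http://www.abcd.com/index.html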
From: Bob M. <mce...@dr...> - 2002-01-17 18:07:13
I noticed your bug report to debian. You should send such things to me
too, as this affects the whole package.

Having never received any Rewrite rules from anybody, I've been unsure if
anyone was writing any! If you have some interesting rules, please
consider sharing them. ;)

How about the following:

I could write a little script 'confmerge' that would take the user's old
configuration file and merge it with a new one. The user's config file
would take priority where there are conflicts. This is simple *unless*
you want to use the rule updates that I write.

What would you suggest for merging Rewrite rules? Diffing of regular
expressions is hardly a straightforward matter. In this release I've
taken the really big rule (ADS) and used the m//x construct to put
whitespace in it, spread it over several lines, so looking at a diff of
it could be useful, but in general comparing regexes is very hard to do
by eye.

So let's say for instance that you have modified the rules for the
andover.net set of sites (the long one that includes slashdot). In this
release I have the following rules:

    1_ANDOVER_ADLOG
    1_GREENDOT
    1_OSDN
    1_ANDOVER

If one of these is modified in the user's old config file, should
'confmerge' choose the user's old one? Give the user a chance to edit
the rule? Choose the new one? Keep in mind that for most users, most of
the rules in the new release will have been modified by me, and they
will have never seen the rule before. Asking them to decide which one to
keep is probably not reasonable.

I suppose it would be possible for every new release to keep a copy of
the previous release's config file for comparison (and it could then
automatically detect if a rule was modified by the user since the last
release). But this would make non-incremental upgrades a bitch.

What about if the user has deleted one rule (say 1_ANDOVER_ADLOG) from a
site? Should 'confmerge' add it back in?

Note your suggestion of separate conf files doesn't really fix this,
because this issue of merging rules is non-trivial.

> filename.def is the default config file to be modified by dpkg only,
> filename.rul the rule filename for the user's override file.

I don't really want to make a distinction between "my rules" and the
user's rules. My rules are not in principle better than ones someone
else would write... And it doesn't solve the diffing problem.

Please, how could I make this better...

Cheers,
-- Bob

Bob McElrath (rsm...@st...)
Univ. of Wisconsin at Madison, Department of Physics
From: Bob M. <mce...@dr...> - 2002-01-16 17:31:52
Mario Lang [la...@zi...] wrote:
> Hi.
>
> As colour-highlighted markup isn't the best method for me, I hacked the
> showfiltering part of Rewrite.pm a bit to optionally produce either
> diff -u or diff -c output... Here is the patch against
> FilterProxy/Rewrite.pm. I wanted to send it to you first, and leave the
> interface changes also up to you if you want to include it...
>
> Basically, if you call showfiltering with an additional parameter
> &diff=-u, you get the difference between the original document and the
> rewritten one.

Looks good to me. Go ahead and check this into CVS.

You should change:

    if(defined $cgi->param('diff') && $cgi->param('diff')) { #...

so that perl doesn't generate an "uninitialized value" warning at runtime.

I used a CSS stylesheet for the colored marked-up pages. I used

    <span class="rulename">

tags to make the colored sections. Would there be a way of changing the
stylesheet instead, to make reading this easier for you?

> Maybe we can even generalize this further to make it work with XSLT too?

The diff would be easy for xslt; go ahead and write it if you want it.
I've been slowly reading a few sites about XSLT. There's a pretty good
rant about it on http://www.kuro5hin.org.

The color-highlighted markup would be very difficult for xslt. As it was,
keeping track of which pieces of the document were modified in Rewrite
was much more difficult than I expected. With XSLT, all the work it does
is hidden in the libxslt library. So a 'diff' is about the only thing you
can do.

I don't like diff, though, because many web pages have very long lines
(generated by a script), so looking at a diff shows you the two 500-char
long lines, and it's a bit of work to scan through that line and figure
out what was changed. :( Probably better than nothing, though!

The diff stuff could be put in FilterProxy.html instead; that way you
could see all the changes at once. But then it would be difficult to tell
what changed what. The Rewrite markup gives you the rule name. How would
you figure out which xslt stylesheet changed what?

> Index: FilterProxy/Rewrite.pm
> ===================================================================
> RCS file: /cvsroot/filterproxy/FilterProxy/FilterProxy/Rewrite.pm,v
> retrieving revision 0.33
> diff -u -r0.33 Rewrite.pm
> --- FilterProxy/Rewrite.pm    2002/01/12 22:43:54    0.33
> +++ FilterProxy/Rewrite.pm    2002/01/15 19:25:40
> @@ -24,6 +24,9 @@
>  use vars qw($VERSION $CONFIG);
>  use Time::HiRes;
>  use URI::Escape;
> +use Fcntl;
> +use POSIX qw(tmpnam);
> +
>  $VERSION = 0.30;
>  push @FilterProxy::MODULES, "Rewrite";
>  $FilterProxy::Rewrite::CONFIG = {};
> @@ -1139,6 +1142,24 @@
>          logger(ERROR, "View unfiltered source returned HTTP code: " . $res->code);
>          return("HTTP code: " . $res->code);
>      }
> +    if ($cgi->param('diff')) {
> +        my $tmpnm1 = tmpnam();
> +        my $tmpnm2 = tmpnam();
> +        sysopen(ORIG, $tmpnm1, O_RDWR|O_CREAT|O_EXCL);
> +        sysopen(NEW, $tmpnm2, O_RDWR|O_CREAT|O_EXCL);
> +        print ORIG $origdoc;
> +        print NEW ${$res->content_ref};
> +        close ORIG; close NEW;
> +        my $prm = $cgi->param('diff');
> +        if ($prm !~ /^-[cu]$/) {
> +            return "Cracker attack...";
> +        }
> +        my $diff = `/usr/bin/diff $prm $tmpnm1 $tmpnm2`;
> +        unlink($tmpnm1); unlink($tmpnm2);
> +        $diff =~ s/</&lt;/g;
> +        $diff =~ s/>/&gt;/g;
> +        return $diff;
> +    } else {
>      $res->content($origdoc); # We don't care what the changed document looks like, we got
>      my($doc) = "";           # everything we need in @markup
>      my($markup);
> @@ -1193,6 +1214,7 @@
>          }
>      }
>      return $message; # message printed to user
> +}
>  }
>
>  # Some code so this can be run at the command line -- for filtering algorithm tests.

Cheers,
-- Bob

Bob McElrath (rsm...@st...)
Univ. of Wisconsin at Madison, Department of Physics