From: Luis D. L. Q. <lui...@gm...> - 2009-07-16 06:36:36
|
Hi all,

After working on fine-tuning anti-spam, I've realized that statistics are essential for identifying high-fuzzy spam. Because of that, we are interested in being a mirror of public.pyzor.org.

Is there anyone here who could guide us on this?

Kind regards,
LD |
From: Dreas v. D. <dr...@sp...> - 2009-07-16 08:14:37
|
Luis Daniel Lucio Quiroz wrote:
> Hi all,
>
> After working on fine tunning anti-spam, I've realize that statistics are
> primordial to identify high-fuzzy spam. Because of that, we are interested on
> being a mirror of public.pyzor.org.
>
> Is here anyone that could guide on this?

Hi!

You are able to run a local Pyzor server with your own data, but the public server data is currently not available for mirroring.

Kind regards,
Dreas |
From: Andreas S. <sch...@fa...> - 2009-07-16 08:45:22
|
On Thu, 16 Jul 2009, at 09:22, Dreas van Donselaar wrote:
> Luis Daniel Lucio Quiroz wrote:
> > we are interested on being a mirror of public.pyzor.org.
> You are able to run a local Pyzor server with your own data, but the
> public server data is currently not available for mirroring.

Though nobody keeps you from making your local server public. And I'd certainly welcome a secondary public pyzor server.

HTH,

--
-- Andreas |
From: Tony M. <to...@sp...> - 2009-07-16 08:43:34
|
> After working on fine tunning anti-spam, I've realize that statistics are
> primordial to identify high-fuzzy spam. Because of that, we are interested on
> being a mirror of public.pyzor.org.

Is there anything about the existing public.pyzor.org that makes it unsuitable for use? i.e. is there anything that could be improved that would mean that you didn't see the need to have your own mirror?

Cheers,
Tony |
From: Andreas S. <sch...@fa...> - 2009-07-16 09:00:27
|
On Thu, 16 Jul 2009, at 20:43, Tony Meyer wrote:
> Is there anything about the existing public.pyzor.org that makes it
> unsuitable for use? i.e. is there anything that could be improved
> that would mean that you didn't see the need to have your own mirror?

My answer is no. Indeed, no matter what I think about it, it's no, because until today I have managed to run my mail servers without a local pyzor server (though I did think about installing one quite often ;) during Pyzor's more troubled times of past years).

Nevertheless, we do see occasional timeouts, which I think are disrupting the whole idea of Pyzor:

$ grep -ci "reporting to pyzor services" $LOGSPAMJULY
2409
$ grep -ci "^public.pyzor.org:24441.*TimeoutError" $LOGSPAMJULY
226

That's a 9 % rate in July.
Is it just me here in Vienna, Austria?

--
-- Andreas

Re-Alpine: https://sourceforge.net/projects/re-alpine/
Reborn Alpine continues UW's Alpine/Pine email client |
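For readers who want to reproduce this kind of measurement, here is a minimal Python sketch of the same calculation as the two grep counts above. It is an illustration only: the function name `timeout_rate` is hypothetical, and the two patterns are simply copied from the grep commands in the message (with the dots escaped), so they only make sense for a log in the same format. With the counts quoted above, 226 / 2409 is roughly 9.4 %.

```python
import re
import sys

# Patterns taken from the grep commands in the message; re.IGNORECASE mirrors grep -i.
# Assumption: the log uses the same line format as the poster's spam log.
REPORT_RE = re.compile(r"reporting to pyzor services", re.IGNORECASE)
TIMEOUT_RE = re.compile(r"^public\.pyzor\.org:24441.*TimeoutError", re.IGNORECASE)


def timeout_rate(lines):
    """Return (reports, timeouts, rate) for an iterable of log lines.

    rate is timeouts divided by reports, or 0.0 if there were no reports.
    """
    reports = 0
    timeouts = 0
    for line in lines:
        if REPORT_RE.search(line):
            reports += 1
        if TIMEOUT_RE.search(line):
            timeouts += 1
    rate = timeouts / reports if reports else 0.0
    return reports, timeouts, rate


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python timeout_rate.py /path/to/logfile
    with open(sys.argv[1], errors="replace") as handle:
        reports, timeouts, rate = timeout_rate(handle)
    print(f"{timeouts} timeouts in {reports} reports ({rate:.1%})")
```

Counting both patterns in one pass keeps the two numbers tied to the same file, which makes it harder to compare counts taken from different logs by mistake.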
From: Dreas v. D. <dr...@sp...> - 2009-07-16 09:48:15
|
Andreas Schamanek wrote:
> Nevertheless, we do see occasional timeouts which I think a
> disrupting the whole idea of Pyzor:
>
> $ grep -ci "reporting to pyzor services" $LOGSPAMJULY
> 2409
> $ grep -ci "^public.pyzor.org:24441.*TimeoutError" $LOGSPAMJULY
> 226
>
> That's a 9 % rate in July.
> Is it just me here in Vienna, Austria?

That's not good. Can you contact me off list so we can try and figure out where these timeouts come from and how we can resolve it?

Thanks :)
Dreas |
From: Luis D. L. Q. <lui...@gm...> - 2009-07-16 14:54:35
|
On Thursday 16 July 2009 04:00:15, Andreas Schamanek wrote:
> On Thu, 16 Jul 2009, at 20:43, Tony Meyer wrote:
> > Is there anything about the existing public.pyzor.org that makes it
> > unsuitable for use? i.e. is there anything that could be improved
> > that would mean that you didn't see the need to have your own mirror?
>
> My answer is no. Indeed, no matter what I think about it it's no
> because until today I managed to run my mail servers without a local
> pyzor server (though I did think about installing one quite often ;)
> during Pyzor's more troubled times of past years).
>
> Nevertheless, we do see occasional timeouts which I think a
> disrupting the whole idea of Pyzor:
>
> $ grep -ci "reporting to pyzor services" $LOGSPAMJULY
> 2409
> $ grep -ci "^public.pyzor.org:24441.*TimeoutError" $LOGSPAMJULY
> 226
>
> That's a 9 % rate in July.
> Is it just me here in Vienna, Austria?

Wouldn't having another mirror reduce that 9% to 4.5%, because of DNS round-robin (RR)? |
From: Dreas v. D. <dr...@sp...> - 2009-07-16 15:07:31
|
Luis Daniel Lucio Quiroz wrote:
> Having another mirror wouldnt reduce that 9% to 4.5% because DNS RR?

Yes, that would happen if we kept the server with a seemingly high failure percentage and added a perfect one without failures. Therefore I'd rather solve the issue at the source, so this becomes close to 0% :)

We will add some monitors to try and determine the cause. Please contact me offlist when you experience issues so we can troubleshoot things.

Regards,
Dreas |
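The arithmetic behind the 9% to 4.5% figure can be sketched in a few lines of Python. This is only a back-of-the-envelope model, and it rests on assumptions that are not stated in the thread: queries are spread evenly across the servers by DNS round-robin, clients do not retry against another server, and each server has a fixed failure rate. The function name `round_robin_failure_rate` is hypothetical.

```python
def round_robin_failure_rate(server_failure_rates):
    """Average failure rate seen by clients under the assumptions above.

    server_failure_rates: one failure probability (0.0 to 1.0) per server.
    With an even spread and no retries, the overall rate is the plain mean.
    """
    if not server_failure_rates:
        raise ValueError("need at least one server")
    return sum(server_failure_rates) / len(server_failure_rates)


# One server failing 9% of the time plus one hypothetical perfect mirror.
combined = round_robin_failure_rate([0.09, 0.0])
print(f"{combined:.1%}")
```

Under this model a mirror only dilutes the failures of the existing server; it does not remove them, which is the reason given above for fixing the problem at the source.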
From: Robert H. L. <la...@la...> - 2009-07-16 09:51:30
|
Andreas Schamanek wrote:
> On Thu, 16 Jul 2009, at 20:43, Tony Meyer wrote:
>
>> Is there anything about the existing public.pyzor.org that makes it
>> unsuitable for use? i.e. is there anything that could be improved
>> that would mean that you didn't see the need to have your own mirror?
>
> My answer is no. Indeed, no matter what I think about it it's no
> because until today I managed to run my mail servers without a local
> pyzor server (though I did think about installing one quite often ;)
> during Pyzor's more troubled times of past years).
>
> Nevertheless, we do see occasional timeouts which I think a
> disrupting the whole idea of Pyzor:
>
> $ grep -ci "reporting to pyzor services" $LOGSPAMJULY
> 2409
> $ grep -ci "^public.pyzor.org:24441.*TimeoutError" $LOGSPAMJULY
> 226
>
> That's a 9 % rate in July.
> Is it just me here in Vienna, Austria?

No, I have seen it give timeouts up to a day at a time. I don't really complain, as my mail server supports just me and two other people.

--
END OF LINE
--MCP |
From: Dreas v. D. <dr...@sp...> - 2009-07-16 10:06:48
|
Robert Hajime Lanning wrote:
> No, I have seen it give timeouts up to a day at a time. I don't really
> complain, as my mail server supports just me and two other people.

Ok, we have to look into that and fix it. Please send me a traceroute offlist if possible.

Dreas |
From: Tony M. <to...@sp...> - 2009-07-16 10:27:54
|
> No, I have seen it give timeouts up to a day at a time.

Do you mean that the server will time out any request for a whole day? Have you seen this recently? Nothing like this should ever happen, and we need to address the monitoring that is done to pick it up (I've certainly not seen anything like that) if it reoccurs.

> I don't really complain, as my mail server supports just me and two other people.

Please do feel free to post a note on the list here if timeouts are common. If we don't know about the issues (e.g. if the server monitoring isn't showing anything, and none of our servers are having trouble using the public server) then we don't know to fix them. You (or anyone else) can email me (to...@sp...) offlist if you prefer.

Thanks,
Tony |
From: Larry N. <py...@bl...> - 2009-07-16 19:40:35
|
I see a few of these errors every day. Yesterday there were about 50 out of about 5000 checks. Sometimes there are more, sometimes less. The errors seem to happen in streaks.

Jul 16 11:19:48 server42 spamd[1051]: pyzor: check failed: internal error

Nedry |
From: Benny P. <me...@ju...> - 2009-07-16 11:01:37
|
On Thu, July 16, 2009 11:00, Andreas Schamanek wrote:
> $ grep -ci "reporting to pyzor services" $LOGSPAMJULY
> 2409
> $ grep -ci "^public.pyzor.org:24441.*TimeoutError" $LOGSPAMJULY
> 226
>
> That's a 9 % rate in July.
> Is it just me here in Vienna, Austria?

Time to make pyzor client code with cache, or make server to server digest working, and help create more mirrors. ACL should be so whitelist/report still is not public submits, unless it's submitted to localhost or own server.

Timeout can be adjusted in sa pyzor to wait more, or if you have a local dns server this would help.

I see timeouts also, but not that much.

--
xpoint |
From: Tony M. <to...@sp...> - 2009-07-16 21:41:58
|
> time to make pyzor client code with cache,

Patches are always welcome! Alternatively, if you think that the client should cache responses, then please open a feature request ticket for that.

I'm not 100% sure this is a good idea. Generally the client doesn't check multiple messages at once, so the cache would have to be stored on disk somewhere. How long should items last in the cache? Are clients checking duplicate messages in a short period of time often? Having the cached items long-lived seems problematic since other users could be reporting the same message again (i.e. increasing the hit count) or whitelisting the message. Would only hits be cached, or hits and whitelisted messages and misses?

> or make server to server digest working, and help create more mirrors,

If you want to submit a patch that enables mirroring, you're welcome to do so. I don't have any interest in working on that myself, sorry.

The improved backend database engine code that's coming in the next release (already in SVN) does make it easy to do this in some cases (e.g. if you use a SQL database that supports some sort of replication, like MySQL, then you can just use that to distribute the data). However, unless you have a database cluster setup (where you can write to any server and have that spread out across the servers), you'd still have to set things up so that writes only went to the master.

The database backend changes should make it both clearer and easier to write other backends, so that should also help (e.g. you might want to try writing one that uses Google App Engine, and use their BigTable storage as a way of having data in multiple places).

> acl should be so whitelist/report still is not public submits, unless its submitted to localhost or own server

Anyone running a server has complete control over the ACL. The public.pyzor.org ACL does not currently allow anonymous users to whitelist, but does allow anyone to report.

I do want to have a discussion about this (but haven't got around to starting a thread yet). Personally, I think reporting has to stay completely public, or much of the value of pyzor will be lost. It seems like there needs to be more access to whitelisting, however. This is related to the other discussions about weighted scores and adding a new "score" command.

> timeout can be adjusted in sa pyzor to wait more,

The pyzor client also has a timeout, which can be set in the configuration file.

Cheers,
Tony |
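To make the cache questions raised above concrete, here is a minimal, hypothetical sketch of a short-lived client-side cache in Python. It is not part of pyzor: the names `DigestCache` and `cached_check`, the 300-second default lifetime, and the `query_server` callable are all assumptions made for illustration. It keeps entries in memory only, so it sidesteps the on-disk storage question, and it uses a short time-to-live precisely because hit counts and whitelist state can change on the server at any moment.

```python
import time


class DigestCache:
    """Hypothetical in-memory cache mapping a message digest to a check result."""

    def __init__(self, ttl_seconds=300.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so expiry can be tested
        self._entries = {}           # digest -> (expiry_time, result)

    def get(self, digest):
        """Return the cached result, or None if absent or expired."""
        entry = self._entries.get(digest)
        if entry is None:
            return None
        expires_at, result = entry
        if self._clock() >= expires_at:
            del self._entries[digest]   # drop stale entries lazily
            return None
        return result

    def put(self, digest, result):
        """Store a result; it expires ttl_seconds from now."""
        self._entries[digest] = (self._clock() + self._ttl, result)


def cached_check(digest, cache, query_server):
    """Return a result for digest, asking query_server only on a cache miss.

    query_server is a placeholder callable standing in for a real lookup;
    it must never return None, because None marks a cache miss here.
    """
    result = cache.get(digest)
    if result is None:
        result = query_server(digest)
        cache.put(digest, result)
    return result
```

This version caches whatever `query_server` returns, hits and misses alike. Whether misses should be cached at all is one of the open questions in the message above, and a real design would have to answer it.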
From: Andreas S. <sch...@fa...> - 2009-07-28 15:26:26
|
Hi all,

Right now (2009-07-28 15:00 UTC) I see some timeouts when reporting.

On Thu, 16 Jul 2009, at 13:01, Benny Pedersen wrote:
> On Thu, July 16, 2009 11:00, Andreas Schamanek wrote:
> > $ grep -ci "reporting to pyzor services" $LOGSPAMJULY
> > 2409
> > $ grep -ci "^public.pyzor.org:24441.*TimeoutError" $LOGSPAMJULY
> > 226
> >
> > That's a 9 % rate in July.

However, I had the numbers wrong. It's actually a rate of a good 2 %. And I have seen fewer timeouts since July 16.

> time to make pyzor client code with cache

FWIW, I am speaking of _pyzor report_ only.

--
-- Andreas |
From: Tony M. <to...@sp...> - 2009-07-29 00:21:02
|
> Right now (2009-07-28 15:00 UTC) I see some timeouts when reporting.

Thanks, I'll look into that.

> FWIW, I am speaking of _pyzor report_ only.

That's an interesting point. Do you mean that 'pyzor check' doesn't time out, or just that you haven't checked to see whether it does or not?

I have focused on making 'check' fast - it could well be that there are some easy solutions for making 'report' more responsive as well.

Cheers,
Tony |
From: Andreas S. <sch...@fa...> - 2009-07-29 10:06:28
|
On Wed, 29 Jul 2009, at 12:20, Tony Meyer wrote:
> > FWIW, I am speaking of _pyzor report_ only.
> That's an interesting point. Do you mean that 'pyzor check' doesn't
> timeout, or just that you haven't checked to see whether it does or
> not.

Well, I am not seeing them. But maybe I am just not looking where I should. Or maybe I am missing a special log facility. Can anyone give me a hand? I am using pyzor from within spamassassin. At least, my mail.log files do not show any "TimeoutError". In July, I only see "pyzor: check failed: internal error" two times (just to prove that there is some logging ;)

> I have focused on making 'check' fast - it could well be that there
> are some easy solutions for making 'report' more responsive as well.

I have just run a check of about 1000 messages. It's not enough data for serious statements; however, my gut feeling says that it ran a bit smoother than the reports.

Cheerio,

--
-- Andreas

ReAlpine: https://sourceforge.net/projects/re-alpine/
Reborn Alpine continues UW's Alpine/Pine email client |