From: Tony M. <to...@sp...> - 2009-06-02 21:37:18
> 1) how can the server get a wl count of 10000 in the first place for
> an obvious piece of spam? (I have a handful of similar emails that
> fall into the same case);

This digest (778941d994b5281bf5652cd293a2761421cc109d) is a special case, which ticket #1037314 deals with. Basically, for some types of message the only content that Pyzor finds to use for digesting is:

    # pyzor predigest < ~/message.eml
    <!DOCTYPEHTMLPUBLICHTML4.0

Obviously, this isn't text that would be unique to a message. Until that ticket is resolved, both ham and spam can end up with the same digest.

With a classifier like Pyzor (where the digests are meant to be unique), it is many times worse to get a false positive than a false negative. For that reason, I manually set the whitelist count to 10,000 for this one digest, so that until the ticket is resolved, messages of this type will never be classified as spam. That means that a few spam messages will be missed, but no ham will be incorrectly classified as spam, which is vastly more important.

> 2) the client seems to override the end result with even a whitelist
> count of 1, judging from the source code.

That's correct - this was also the case in 0.4, and I believe it has been true ever since Frank originally added the whitelist functionality. That's a decade before my time, but my guess would be that he felt that the whitelist functionality was necessary, but wanted to ensure that existing tools (perhaps the SpamAssassin plugin) continued to work. For example, the current SpamAssassin plug-in (which could well be the same code as when 0.4 was released) ignores the whitelist count completely. That means that unless the hit count is adjusted, the whitelisting would have no effect. Since authentication is required for the whitelist command, and a false positive is vastly worse (especially with a hash-based classifier) than a false negative, it seems a reasonable choice.

Looking forward, my feeling (as outlined on the list previously) is that adding a new command ("score"), which combined the hit and whitelist counts to produce a 0-1 score, would be a useful addition. This would allow a more refined use of the two counts. I don't think it's right to adjust the current behaviour of the "check" command, since it has behaved that way for so long. If users wish to make use of the individual counts, then they can either do a check command without using the standard pyzor client (since it is the client that overrules the hit count, not the server), or use the info command and parse the result accordingly.

Cheers,
Tony
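The override discussed in this message can be sketched in a few lines. This is only an illustration of the behaviour described in the thread, not code taken from the Pyzor client; the function name `reported_count` is made up for the example.

```python
def reported_count(count: int, wl_count: int) -> int:
    """Hit count as the client reports it: any whitelisting forces it to 0."""
    return 0 if wl_count > 0 else count


# Values from the server response quoted in this thread.
print(reported_count(2000, 10000))  # 0
print(reported_count(2000, 0))  # 2000
```

A tool that wants the raw numbers would have to read both counts itself instead of relying on the adjusted value.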
From: Patrick Yu <ipa...@gm...> - 2009-06-02 19:08:14
Hi,

When I switched the pyzor server to public.pyzor.org:24441, I noticed one particular issue with wl-count: some obviously spam emails got past the pyzor check. I manually ran pyzor with -d to see what happened, and here's what it showed:

    sending: 'User: anonymous\nTime: 1243969244\nSig: 2158f78e71b072f84662000eb0a503909db355f5\n\nOp: check\nOp-Digest: 778941d994b5281bf5652cd293a2761421cc109d\nThread: 31683\nPV: 2.0\n\n'
    received: 'Thread: 31683\nCount: 2000\nWL-Count: 10000\nCode: 200\nDiag: OK\nPV: 2.0\n\n'
    public.pyzor.org:24441 (200, 'OK') 0 10000

The server seems to report a hit count of 2000 and, at the same time, a whitelist count of 10000, and the client decides to report a result of 0! This is quite puzzling to me:

1) how can the server get a wl count of 10000 in the first place for an obvious piece of spam? (I have a handful of similar emails that fall into the same case);
2) the client seems to override the end result with even a whitelist count of 1, judging from the source code.

Is this normal behavior to expect, or is there some reason I am seeing this?

-PY
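For anyone who wants to look at the two counts separately, the `received:` packet shown above is plain "Key: value" text and is easy to take apart. The sketch below is a minimal illustration; `parse_response` is a hypothetical helper, not part of Pyzor.

```python
def parse_response(raw: str) -> dict:
    """Split a 'Key: value' response packet into a dictionary of strings."""
    fields = {}
    for line in raw.splitlines():
        if not line.strip():
            continue  # skip the blank lines that end the packet
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


# The packet quoted in the debug output above.
raw = "Thread: 31683\nCount: 2000\nWL-Count: 10000\nCode: 200\nDiag: OK\nPV: 2.0\n\n"
response = parse_response(raw)
print(int(response["Count"]), int(response["WL-Count"]))  # 2000 10000
```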
From: Patrick Yu <ipa...@gm...> - 2009-06-02 18:50:14
Hi,

After upgrading to 0.5, pyzor fails to check some multipart emails which were perfectly fine with 0.4. pyzor -d [digest|check] just shows nothing. I have pasted an example email here: http://pastebin.com/f18a30da9

Does anyone have an idea whether it's a bug or something else? For now, I am going to revert to 0.4.

-PY
From: Tony M. <to...@sp...> - 2009-06-01 23:16:58
Hi,

Sorry about the slow response - it was a long weekend here and so I was busy doing other things.

> pyzor 2>&1 info <msg
> public.pyzor.org:24441 (500, "Internal Server Error: 'NoneType' object has
> no attribute 'timetuple'")
[...]
> is info not supported anymore ?

Sorry - I broke this when changing over to the new database backend. It worked if there was a time for both whitelist and hit count, and that's what I quickly tested info with. I'll test more thoroughly next time.

It should be working properly again now.

Cheers,
Tony
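The error message suggests a formatter calling `timetuple()` on a timestamp that was never set. A guard of the following kind avoids that; this is a sketch under that assumption, with a made-up function name, and is not the actual server fix.

```python
import time
from datetime import datetime
from typing import Optional


def format_timestamp(stamp: Optional[datetime]) -> str:
    """Return an asctime-style string, or 'Never' when no time was recorded."""
    if stamp is None:
        return "Never"
    return time.asctime(stamp.timetuple())


print(format_timestamp(datetime(2009, 5, 30, 4, 5, 30)))  # Sat May 30 04:05:30 2009
print(format_timestamp(None))  # Never
```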
From: Benny P. <me...@ju...> - 2009-05-30 02:17:53
    pyzor 2>&1 info <msg
    public.pyzor.org:24441 (500, "Internal Server Error: 'NoneType' object has no attribute 'timetuple'")
    127.0.0.1:24441 (200, 'OK')
    Count: 1
    Entered: Sat May 30 04:05:30 2009
    Updated: Sat May 30 04:05:30 2009
    WL-Entered: Never
    WL-Updated: Never

Is info not supported anymore?

--
http://localhost/ 100% uptime and 100% mirrored :)
From: Chris <cpo...@em...> - 2009-05-15 22:36:34
On Fri, 2009-05-15 at 13:02 +1200, Tony Meyer wrote:
> > Python was installed via rpm, I'm installing pyzor from your source
> > file.
>
> Is this problem unique to pyzor, or are you able to install any Python
> packages with distutils? Pyzor's installation is about as vanilla a
> distutils install as you can get.
>
> Cheers,
> Tony

Fixed now, Tony. I was missing the devel libs, which install the /config dir and the necessary files.

Chris

--
KeyID 0xE372A7DA98E6705C
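A quick way to check for the same problem is to ask the interpreter which Makefile it expects and whether that file is present. The thread concerns Python 2.6 and distutils; the sketch below instead uses the `sysconfig` module from current Python 3, and `makefile_status` is a name invented for the example.

```python
import os
import sysconfig


def makefile_status() -> tuple:
    """Return the Makefile path this interpreter expects and whether it exists."""
    path = sysconfig.get_makefile_filename()
    return path, os.path.exists(path)


path, present = makefile_status()
print(path, "found" if present else "missing")
```

If the file is reported missing, installing the distribution's Python development package is the fix that worked in this thread.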
From: Tony M. <to...@sp...> - 2009-05-15 01:09:15
> Python was installed via rpm, I'm installing pyzor from your source
> file.

Is this problem unique to pyzor, or are you able to install any Python packages with distutils? Pyzor's installation is about as vanilla a distutils install as you can get.

Cheers,
Tony
From: Chris <cpo...@em...> - 2009-05-15 00:41:39
On Fri, 2009-05-15 at 11:05 +1200, Tony Meyer wrote:
> > Installation issue:
> >
> > error: invalid Python installation: unable to open /usr/lib/python2.6/config/Makefile
> > (No such file or directory)
> >
> > And that's right it doesn't exist, neither /config or a Makefile, any suggestions?
> > This is on Mandriva 2009.1, installed via RPM's
>
> I'm afraid that we don't have anything to do with building Pyzor RPMs
> (we only release plain Python distutils setups). You'll have to
> contact whoever built the Mandriva RPM and ask them.
>
> Cheers,
> Tony

Python was installed via rpm; I'm installing pyzor from your source file.

--
KeyID 0xE372A7DA98E6705C
From: Tony M. <to...@sp...> - 2009-05-14 23:05:36
> Installation issue:
>
> error: invalid Python installation: unable to open /usr/lib/python2.6/config/Makefile
> (No such file or directory)
>
> And that's right it doesn't exist, neither /config or a Makefile, any suggestions?
> This is on Mandriva 2009.1, installed via RPM's

I'm afraid that we don't have anything to do with building Pyzor RPMs (we only release plain Python distutils setups). You'll have to contact whoever built the Mandriva RPM and ask them.

Cheers,
Tony
From: Chris <cpo...@em...> - 2009-05-14 22:54:51
On Thu, 2009-05-07 at 20:59 +1200, Tony Meyer wrote:
> The Pyzor team is pleased to announce release 0.5 of Pyzor.

Installation issue:

    error: invalid Python installation: unable to open /usr/lib/python2.6/config/Makefile (No such file or directory)

And that's right, it doesn't exist: neither /config nor a Makefile. Any suggestions? This is on Mandriva 2009.1, installed via RPMs.

--
KeyID 0xE372A7DA98E6705C
From: Dreas v. D. <dr...@sp...> - 2009-05-14 14:08:03
Sven Hergenhahn wrote:
>> What happens if you check SA with something that is large enough to be digested?
>
> # spamassassin -D pyzor < /root/.cpan/build/Mail-SpamAssassin-3.2.5/sample-spam.txt
> [1909] dbg: pyzor: network tests on, attempting Pyzor
> [1909] dbg: pyzor: pyzor is available: /usr/bin/pyzor
> [1909] dbg: pyzor: opening pipe: /usr/bin/pyzor --homedir /etc/mail/spamassassin check < /tmp/.spamassassin19092qpHaUtmp
> [1909] dbg: pyzor: killed stale helper [1910]
> [1909] dbg: pyzor: [1910] terminated: exit=0x000f
> [1909] dbg: pyzor: check timed out after 3.5 seconds
>
> Still no real success, but some progress at least. Any ideas on how to
> proceed from here?

We discussed this off list (just posting to the list for others). SpamAssassin didn't manage to get the Pyzor response within 3.5 seconds and therefore timed out. Sven increased the timeout, and that solved his issue. We'll try to determine why his checks are so slow (off list).

Regards,
Dreas
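Before choosing a new timeout, it can help to measure how long a check really takes outside SpamAssassin. The helper below is a rough, generic sketch (`time_command` is an invented name); the pyzor invocation in the trailing comment is the one from the debug log above and needs pyzor installed. The SpamAssassin Pyzor plug-in documents a `pyzor_timeout` setting, which is presumably the value that was raised here.

```python
import subprocess
import time


def time_command(argv, stdin_data=b"", timeout=30.0):
    """Run argv with stdin_data and return (elapsed seconds, exit code).

    The exit code is None if the command exceeds the timeout.
    """
    start = time.monotonic()
    try:
        done = subprocess.run(argv, input=stdin_data, capture_output=True, timeout=timeout)
        code = done.returncode
    except subprocess.TimeoutExpired:
        code = None
    return time.monotonic() - start, code


# Example, assuming pyzor is installed and sample-spam.txt is a local file:
# with open("sample-spam.txt", "rb") as handle:
#     print(time_command(["pyzor", "--homedir", "/etc/mail/spamassassin", "check"], handle.read()))
```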
From: Sven H. <sv...@gm...> - 2009-05-14 06:42:50
Hi Tony,

Tony Meyer wrote:
>> # echo "test" | spamassassin -D pyzor
>> ...
>> [18993] dbg: pyzor: check failed: no response
> If the message is too small to have a digest, then SA is correct in
> indicating that there is no response (the server never even gets
> queried).

Thanks for this explanation. It makes things much clearer for me.

> What happens if you check SA with something that is large enough to be digested?

    # spamassassin -D pyzor < /root/.cpan/build/Mail-SpamAssassin-3.2.5/sample-spam.txt
    [1909] dbg: pyzor: network tests on, attempting Pyzor
    [1909] dbg: pyzor: pyzor is available: /usr/bin/pyzor
    [1909] dbg: pyzor: opening pipe: /usr/bin/pyzor --homedir /etc/mail/spamassassin check < /tmp/.spamassassin19092qpHaUtmp
    [1909] dbg: pyzor: killed stale helper [1910]
    [1909] dbg: pyzor: [1910] terminated: exit=0x000f
    [1909] dbg: pyzor: check timed out after 3.5 seconds

Still no real success, but some progress at least. Any ideas on how to proceed from here?

Cheers,
Sven
From: Tony M. <to...@sp...> - 2009-05-13 20:53:55
> # echo "test" | spamassassin -D pyzor
> [18993] dbg: pyzor: network tests on, attempting Pyzor
> [18993] dbg: pyzor: pyzor is available: /usr/bin/pyzor
> [18993] dbg: pyzor: opening pipe: /usr/bin/pyzor --homedir /etc/mail/spamassassin check < /tmp/.spamassassin18993PRMdaxtmp
> [18993] dbg: pyzor: [18994] finished: exit=0x0100
>
> [18993] dbg: pyzor: check failed: no response

You can't use "test" as a test string, because it's too small to have a digest, so there will never be a response (and nor should there - anything that small doesn't have enough data for a hash-based check to be reliable). In fact, since "test" is a (badly formed) header, you have no content in the body of the email at all. You need to have something in the body (i.e. at the very least a blank line indicating no headers), and you need to have a line that is at least 8 characters long when normalised.

You can check if what you're passing is too short by checking if it has a digest. For example:

    ~$ echo "test" | pyzor digest
    ~$ echo "
    > testtestt" | pyzor digest
    7c69679dbb3449b8052d6d14cbc826562c6191e7

If the message is too small to have a digest, then SA is correct in indicating that there is no response (the server never even gets queried).

What happens if you check SA with something that is large enough to be digested?

Cheers,
Tony
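The two conditions named above (a body after the headers, and at least one line of 8 or more characters once normalised) can be illustrated with a deliberately simplified check. This is not Pyzor's digest code: "normalised" is approximated here by removing whitespace only, and both function names are invented for the example.

```python
MIN_LINE_LENGTH = 8  # threshold mentioned in the message above


def split_body(raw: str) -> str:
    """Return the text after the first blank line, or '' if there is none."""
    if raw.startswith("\n"):
        return raw[1:]  # leading blank line: no headers at all
    _, separator, body = raw.partition("\n\n")
    return body if separator else ""


def digestible_lines(raw: str) -> list:
    """Body lines that are still at least MIN_LINE_LENGTH long without whitespace."""
    lines = []
    for line in split_body(raw).splitlines():
        squeezed = "".join(line.split())  # crude stand-in for normalisation
        if len(squeezed) >= MIN_LINE_LENGTH:
            lines.append(squeezed)
    return lines


print(digestible_lines("test\n"))  # []
print(digestible_lines("\ntesttestt\n"))  # ['testtestt']
```

An empty result corresponds to the case where no digest is produced and the server is never queried.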
From: Sven H. <sv...@gm...> - 2009-05-13 12:37:00
Hi Dreas,

thanks for picking this up...

Dreas van Donselaar wrote:
> Sven Hergenhahn wrote:
>> # echo "test" | spamassassin -D pyzor
>> [18993] dbg: pyzor: network tests on, attempting Pyzor
>> [18993] dbg: pyzor: pyzor is available: /usr/bin/pyzor
>> [18993] dbg: pyzor: opening pipe: /usr/bin/pyzor --homedir /etc/mail/spamassassin check < /tmp/.spamassassin18993PRMdaxtmp
>> [18993] dbg: pyzor: [18994] finished: exit=0x0100
>>
>> [18993] dbg: pyzor: check failed: no response
>> ...
> What Pyzor version are you using? The latest version 0.5.0 should
> contain fixes for this issue.

I am using 0.5.0.

Cheers,
Sven
From: Dreas v. D. <dr...@sp...> - 2009-05-13 09:10:26
Sven Hergenhahn wrote:
> # echo "test" | spamassassin -D pyzor
> [18993] dbg: pyzor: network tests on, attempting Pyzor
> [18993] dbg: pyzor: pyzor is available: /usr/bin/pyzor
> [18993] dbg: pyzor: opening pipe: /usr/bin/pyzor --homedir /etc/mail/spamassassin check < /tmp/.spamassassin18993PRMdaxtmp
> [18993] dbg: pyzor: [18994] finished: exit=0x0100
>
> [18993] dbg: pyzor: check failed: no response
> ...
>
> any ideas why I'm getting this?

What Pyzor version are you using? The latest version 0.5.0 should contain fixes for this issue.

Regards,
Dreas
From: Sven H. <sv...@gm...> - 2009-05-12 14:53:35
Hi everyone,

I have the following:

    # cat /etc/mail/spamassassin/servers
    public.pyzor.org:24441

    pyzor --homedir /etc/mail/spamassassin check < spamtestfile.txt
    public.pyzor.org:24441 (200, 'OK') 0 0

--> seems OK - agreed?

But then:

    # echo "test" | spamassassin -D pyzor
    [18993] dbg: pyzor: network tests on, attempting Pyzor
    [18993] dbg: pyzor: pyzor is available: /usr/bin/pyzor
    [18993] dbg: pyzor: opening pipe: /usr/bin/pyzor --homedir /etc/mail/spamassassin check < /tmp/.spamassassin18993PRMdaxtmp
    [18993] dbg: pyzor: [18994] finished: exit=0x0100

    [18993] dbg: pyzor: check failed: no response
    ...

Any ideas why I'm getting this?

Thanks in advance,
Sven
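The one-line result of `pyzor check` shown above carries the server, the status pair, and two numbers (hit count and whitelist count, going by the other outputs in this archive). A script that wants those fields could split the line as sketched below; the pattern only covers the successful form seen in this thread, and the names are invented for the example.

```python
import re
from typing import NamedTuple, Optional


class CheckLine(NamedTuple):
    server: str
    code: int
    diag: str
    hits: int
    wl_hits: int


# Matches lines such as: public.pyzor.org:24441 (200, 'OK') 0 0
_PATTERN = re.compile(r"^(\S+)\s+\((\d+), '([^']*)'\)\s+(\d+)\s+(\d+)$")


def parse_check_line(line: str) -> Optional[CheckLine]:
    """Parse one successful check line; return None for anything else."""
    match = _PATTERN.match(line.strip())
    if match is None:
        return None
    server, code, diag, hits, wl_hits = match.groups()
    return CheckLine(server, int(code), diag, int(hits), int(wl_hits))


print(parse_check_line("public.pyzor.org:24441 (200, 'OK') 0 0"))
```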
From: Benedict W. <Ben...@cs...> - 2009-05-11 11:04:14
-----Original Message-----
From: Tony Meyer [mailto:to...@sp...]
Sent: 09 May 2009 22:12
To: pyz...@li...
Subject: Re: Pyzor 0.5 Released

>>> One other thing I would like to see is a configurable option to tell the pyzor client to
>>> aggregate the blacklist/whitelist scores rather than leave that up to spamassassin.
[...]
> What I was hoping for was a configurable option on the client end
> so that it would reply with score to a request for a check, so that
> spamassassin does not need to be fiddled with.

Hmm. I can see value in that. Possibly both could be done - i.e. the option could be "treat 'check' as 'score'". I'll keep that in mind.

>> Yes, that was what I had in mind... BTW, Pyzor 0.5 seems to be working fine.

Kind Regards

Benedict White
From: Tony M. <to...@sp...> - 2009-05-09 21:12:11
>>> One other thing I would like to see is a configurable option to tell the pyzor client to
>>> aggregate the blacklist/whitelist scores rather than leave that up to spamassassin.
[...]
> What I was hoping for was a configurable option on the client end
> so that it would reply with score to a request for a check, so that
> spamassassin does not need to be fiddled with.

Hmm. I can see value in that. Possibly both could be done - i.e. the option could be "treat 'check' as 'score'". I'll keep that in mind.

Thanks,
Tony
From: Benedict W. <Ben...@cs...> - 2009-05-08 08:51:41
-----Original Message-----
From: Tony Meyer [mailto:to...@sp...]
Sent: 08 May 2009 02:02
To: pyz...@li...
Subject: Re: Pyzor 0.5 Released

>Peering isn't high on the list of features that I'm interested in
>adding. The new server can handle the load without problem (although
>I'd like to make some additional performance improvements). It seems
>like quite a bit of extra complexity, for a reasonably small gain.
>But if there is demand for this, then I can move it up the list.

Well, it certainly is a lot of added complexity. It seems to me that you would want a separate process handling peering and reports from the one handling check requests.

>> One other thing I would like to see is a configurable option to tell the pyzor client to
>> aggregate the blacklist/whitelist scores rather than leave that up to spamassassin.

>My current thinking is that a new command would be good (something
>like 'score'), that returned a single number that combined the hit and
>whitelist counts in some way. I agree there is value to having a way
>for the pyzor client to give back a single number for ease-of-use.
>However, I think the 'check' command should probably stay the same, so
>that backwards compatibility is maintained.

What I was hoping for was a configurable option on the client end so that it would reply with score to a request for a check, so that spamassassin does not need to be fiddled with.

>Thanks for your thoughts - I've saved the peering comments so that I
>can go back to them if/when peering comes up.

No problem. As ever, it is interesting to think about the best ways of doing things!

Kind Regards

Benedict White
From: Tony M. <to...@sp...> - 2009-05-08 01:02:54
>> Pyzor initially started out to be merely a Python implementation of
>> Razor, but due to the protocol and the fact that Razor's server is not
>> Open Source or software libre, Frank Tobin decided to implement Pyzor
>> with a new protocol and release the entire system as Open Source and
>> software libre.
>
> which reminds me of a question: is PYZOR compatible with RAZOR yet, or
> not?

As far as I know, they are still entirely separate protocols, as described in the paragraph above (which I took from the main page of the website).

Is there demand for compatibility? If I understand correctly, Razor is free to use now, so you could use both Razor and Pyzor if you wanted to.

Cheers,
Tony
From: Tony M. <to...@sp...> - 2009-05-08 01:02:24
The website does say that Frank planned on adding peering. I'm not totally sure why he wanted that - perhaps it just seemed more open and future proof, or perhaps he felt it was the only way to scale. (I really must try and find more historical information so that I can know these things!)

Peering isn't high on the list of features that I'm interested in adding. The new server can handle the load without problem (although I'd like to make some additional performance improvements). It seems like quite a bit of extra complexity, for a reasonably small gain. But if there is demand for this, then I can move it up the list.

> One other thing I would like to see is a configurable option to tell the pyzor client to
> aggregate the blacklist/whitelist scores rather than leave that up to spamassassin.

My current thinking is that a new command would be good (something like 'score'), that returned a single number that combined the hit and whitelist counts in some way. I agree there is value to having a way for the pyzor client to give back a single number for ease-of-use. However, I think the 'check' command should probably stay the same, so that backwards compatibility is maintained.

Thanks for your thoughts - I've saved the peering comments so that I can go back to them if/when peering comes up.

Cheers,
Tony
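The message leaves open how a 'score' command would combine the two counts. Purely as an illustration of the idea, one possibility is the share of spam reports among all reports; the formula and the function name below are assumptions made for this sketch, not anything Pyzor defines.

```python
def score(count: int, wl_count: int) -> float:
    """Fraction of all reports that were spam reports, between 0.0 and 1.0."""
    total = count + wl_count
    if total == 0:
        return 0.0  # digest unknown to the server: treat as not spam
    return count / total


print(score(2000, 0))  # 1.0
print(score(0, 0))  # 0.0
print(score(30, 10))  # 0.75
```

A plain ratio ignores how many reports there are in absolute terms, so a real command would probably weigh that as well.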
From: Benedict W. <Ben...@cs...> - 2009-05-07 13:11:18
Many apologies for my first reply, which had no word wrapping in it at all. I wrongly thought that I had got Outlook to do what I wanted it to do, but I was very, very wrong, so here is the same email, formatted in Notepad.

-----Original Message-----
From: Matus UHLAR - fantomas [mailto:uh...@fa...]
Sent: 07 May 2009 12:15
To: pyz...@li...
Subject: Re: Pyzor 0.5 Released

> We are aiming to release the next version of Pyzor, which will include
> new features, around the end of June (2009!). If you'd like to have
> input into that release, please subscribe to the pyzor-users mailing
> list, or monitor the SourceForge tickets for the Pyzor project. We
> are very keen to have as much input from the user-base as possible.

On 07.05.09 10:46, Benedict White wrote:
> Well, if you are looking to do peering of servers, I have some ideas
> there, though they do add a level of complexity on the server side.

please configure your mailer to wrap lines below 80 characters per line. 72 to 75 is usually OK. I've had to rewrap your posting for better reading.

>> Sorry!

> The only issue to deal with is identifying when the same server has
> had multiple reports of the same spam (which this scheme so far does
> not allow for)

...and this is very important since the same servers may be flooded by same/similar spam

>> Yes, as in I report one instance of a spam, so does someone else, and then someone else reports the same spam 3 times as they have got it 3 times.

> and that could be dealt with by adding two more bits of information,
> (or possibly combining it) some hit count, and also a version
> number/timestamp of some sort.

I'm currently only thinking about the (count checksum server1 server2 ...) lists. Optimizations may come later.

>> There are going to have to be two databases involved: one to hold scores and one to deal with peering information. This is because server A, which has had 5 of the same emails reported to it, will need to hold that separately so that it can then pass that report on, whilst telling clients who are checking for spam how many reports have been received by the network or peered servers.

>> We also have to remember that my server may have had two instances of a spam reported, which I then pass on in the form 2:<some hash>:my-server-id, but then I receive one or more further reports of the same spam and so need to send a new report to the peers with an increased count. Hence the need for some kind of stamp (either time or version based) to say that I am updating my last report with a more up-to-date report.

>> Also, to save checking (in the case of rings of peers), it makes sense to list the servers in order, starting with the origin server, so that if I want to check whether I have had the report I need only check the checksum, origin server and stamp/version number. If I have seen this already (via another server), I can discount the report quickly.

> Also there would need to be a flag saying that the hash is either spam
> or ham.

you mean spam/whitelist according to current naming. One of things I'm thinking about is using multiple checksum algorithms.

>> Yes I do. By multiple checksum algorithms, do you mean that checking for a whitelist will use a different routine (and therefore produce a different answer) from the one for spam?

> One other thing I would like to see is a configurable option to tell
> the pyzor client to aggregate the blacklist/whitelist scores rather
> than leave that up to spamassassin.

asking for "score" it would be.

>> I have not looked at what spamassassin asks Pyzor to provide. What I was hoping for was the ability to tell pyzor to provide the current spamassassin with either a spam count or a spam count minus whitelist count. That said, maybe I just need to see if there is a fix for spamassassin.

Kind Regards

Benedict White
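The bookkeeping proposed in this message (reports of the form count:hash:origin, plus a stamp so that a newer report from the same origin replaces the older one) can be sketched as follows. Nothing like this exists in Pyzor as far as the thread says; the class, its method names, and the use of an integer version as the stamp are all assumptions made for illustration.

```python
from typing import Dict, Tuple


class PeerLedger:
    """Keep the latest report per (checksum, origin server)."""

    def __init__(self) -> None:
        # (checksum, origin) -> (version, count)
        self._latest: Dict[Tuple[str, str], Tuple[int, int]] = {}

    def receive(self, count: int, checksum: str, origin: str, version: int) -> bool:
        """Store the report; return False if it is a duplicate or out of date."""
        key = (checksum, origin)
        seen = self._latest.get(key)
        if seen is not None and seen[0] >= version:
            return False
        self._latest[key] = (version, count)
        return True

    def total(self, checksum: str) -> int:
        """Sum the newest counts from every origin server for this checksum."""
        return sum(
            count
            for (digest, _), (_, count) in self._latest.items()
            if digest == checksum
        )


ledger = PeerLedger()
ledger.receive(2, "abc123", "server-a", version=1)
ledger.receive(2, "abc123", "server-a", version=1)  # same report via another peer: ignored
ledger.receive(5, "abc123", "server-a", version=2)  # updated count replaces the old one
ledger.receive(1, "abc123", "server-b", version=1)
print(ledger.total("abc123"))  # 6
```

Checking only the checksum, origin and version is what lets a server in a ring of peers discard a report it has already seen without further work.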
From: Benedict W. <Ben...@cs...> - 2009-05-07 12:07:28
-----Original Message-----
From: Matus UHLAR - fantomas [mailto:uh...@fa...]
Sent: 07 May 2009 12:15
To: pyz...@li...
Subject: Re: Pyzor 0.5 Released

> We are aiming to release the next version of Pyzor, which will include
> new features, around the end of June (2009!). If you'd like to have
> input into that release, please subscribe to the pyzor-users mailing
> list, or monitor the SourceForge tickets for the Pyzor project. We
> are very keen to have as much input from the user-base as possible.

On 07.05.09 10:46, Benedict White wrote:
> Well, if you are looking to do peering of servers, I have some ideas there,
> though they do add a level of complexity on the server side.

please configure your mailer to wrap lines below 80 characters per line. 72 to 75 is usually OK. I've had to rewrap your posting for better reading.

>> Sorry!

> The only issue to deal with is identifying when the same server has had
> multiple reports of the same spam (which this scheme so far does not allow
> for)

...and this is very important since the same servers may be flooded by same/similar spam

>> Yes, as in I report one instance of a spam, so does someone else, and then someone else reports the same spam 3 times as they have got it 3 times.

> and that could be dealt with by adding two more bits of information, (or
> possibly combining it) some hit count, and also a version number/timestamp
> of some sort.

I'm currently only thinking about the (count checksum server1 server2 ...) lists. Optimizations may come later.

>> There are going to have to be two databases involved: one to hold scores and one to deal with peering information. This is because server A, which has had 5 of the same emails reported to it, will need to hold that separately so that it can then pass that report on, whilst telling clients who are checking for spam how many reports have been received by the network or peered servers.

>> We also have to remember that my server may have had two instances of a spam reported, which I then pass on in the form 2:<some hash>:my-server-id, but then I receive one or more further reports of the same spam and so need to send a new report to the peers with an increased count. Hence the need for some kind of stamp (either time or version based) to say that I am updating my last report with a more up-to-date report.

>> Also, to save checking (in the case of rings of peers), it makes sense to list the servers in order, starting with the origin server, so that if I want to check whether I have had the report I need only check the checksum, origin server and stamp/version number. If I have seen this already (via another server), I can discount the report quickly.

> Also there would need to be a flag saying that the hash is
> either spam or ham.

you mean spam/whitelist according to current naming. One of things I'm thinking about is using multiple checksum algorithms.

>> Yes I do. By multiple checksum algorithms, do you mean that checking for a whitelist will use a different routine (and therefore produce a different answer) from the one for spam?

> One other thing I would like to see is a configurable option to tell the
> pyzor client to aggregate the blacklist/whitelist scores rather than leave
> that up to spamassassin.

asking for "score" it would be.

>> I have not looked at what spamassassin asks Pyzor to provide. What I was hoping for was the ability to tell pyzor to provide the current spamassassin with either a spam count or a spam count minus whitelist count. That said, maybe I just need to see if there is a fix for spamassassin.

Kind Regards

Benedict White
From: Matus U. - f. <uh...@fa...> - 2009-05-07 11:14:50
> We are aiming to release the next version of Pyzor, which will include
> new features, around the end of June (2009!). If you'd like to have
> input into that release, please subscribe to the pyzor-users mailing
> list, or monitor the SourceForge tickets for the Pyzor project. We
> are very keen to have as much input from the user-base as possible.

On 07.05.09 10:46, Benedict White wrote:
> Well, if you are looking to do peering of servers, I have some ideas there,
> though they do add a level of complexity on the server side.

please configure your mailer to wrap lines below 80 characters per line. 72 to 75 is usually OK. I've had to rewrap your posting for better reading.

> I was thinking that the best way to peer reports is similar to the way INN
> works for transferring usenet messages, which is that each message has a
> message ID (Perhaps we would use the message hash) and a server ID, which
> is the IP of the server which originally peered that hash (as in the one
> that got the report). That way if you get a server which sees the same
> report twice, it counts it only once. If the same mail is reported from
> elsewhere it would have a different server ID so would add to the number
> of times that email had been reported.

s/INN/Usenet news/

a bit more about the architecture:

- each message contains information about which servers it has travelled through, so each server can work out where to send it.
- each server remembers which Message-IDs it has seen, so it will reject duplicates (this causes problems if someone generates an invalid Message-Id, so it may be left up to the receiving NNTP server)

Yes, the architecture for pyzor could be similar, but also quite different, since the same checksum can be received multiple times by multiple servers, and they must forward the numbers to other servers while avoiding transferring the same content to a server that has already seen it.

DCC apparently has an architecture for distributing checksums across a redundant network.

> The only issue to deal with is identifying when the same server has had
> multiple reports of the same spam (which this scheme so far does not allow
> for)

...and this is very important since the same servers may be flooded by same/similar spam

> and that could be dealt with by adding two more bits of information, (or
> possibly combining it) some hit count, and also a version number/timestamp
> of some sort.

I'm currently only thinking about the (count checksum server1 server2 ...) lists. Optimizations may come later.

> Also there would need to be a flag saying that the hash is
> either spam or ham.

you mean spam/whitelist according to current naming. One of the things I'm thinking about is using multiple checksum algorithms.

> One other thing I would like to see is a configurable option to tell the
> pyzor client to aggregate the blacklist/whitelist scores rather than leave
> that up to spamassassin.

asking for "score" it would be.

--
Matus UHLAR - fantomas, uh...@fa... ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Linux - It's now safe to turn on your computer.
Linux - Teraz mozete pocitac bez obav zapnut.
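The "(count checksum server1 server2 ...)" list mentioned above works much like a Usenet path: every server adds itself and skips peers that already appear in the list. A minimal sketch of that forwarding rule follows; the `Report` type and `forward_targets` function are hypothetical and exist only to illustrate the idea.

```python
from typing import List, NamedTuple, Tuple


class Report(NamedTuple):
    hits: int
    checksum: str
    path: Tuple[str, ...]  # servers the report has passed through, origin first


def forward_targets(report: Report, me: str, peers: List[str]) -> Tuple[Report, List[str]]:
    """Append this server to the path and pick peers that have not seen the report."""
    stamped = report._replace(path=report.path + (me,))
    targets = [peer for peer in peers if peer not in stamped.path]
    return stamped, targets


incoming = Report(3, "abc123", ("server-a", "server-b"))
stamped, targets = forward_targets(incoming, "server-c", ["server-a", "server-b", "server-d"])
print(stamped.path)  # ('server-a', 'server-b', 'server-c')
print(targets)  # ['server-d']
```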
From: Benedict W. <Ben...@cs...> - 2009-05-07 09:46:57
-----Original Message-----
From: Tony Meyer [mailto:to...@sp...]
Sent: 07 May 2009 09:59
To: pyt...@py...; pyz...@li...; pyz...@li...
Subject: Pyzor 0.5 Released

The Pyzor team is pleased to announce release 0.5 of Pyzor.

>> Well done!

We are aiming to release the next version of Pyzor, which will include new features, around the end of June (2009!). If you'd like to have input into that release, please subscribe to the pyzor-users mailing list, or monitor the SourceForge tickets for the Pyzor project. We are very keen to have as much input from the user-base as possible.

>> Well, if you are looking to do peering of servers, I have some ideas there, though they do add a level of complexity on the server side.

I was thinking that the best way to peer reports is similar to the way INN works for transferring usenet messages, which is that each message has a message ID (perhaps we would use the message hash) and a server ID, which is the IP of the server which originally peered that hash (as in the one that got the report). That way, if a server sees the same report twice, it counts it only once. If the same mail is reported from elsewhere, it would have a different server ID and so would add to the number of times that email had been reported.

The only issue to deal with is identifying when the same server has had multiple reports of the same spam (which this scheme so far does not allow for), and that could be dealt with by adding two more bits of information (or possibly combining them): some hit count, and also a version number/timestamp of some sort. Also, there would need to be a flag saying that the hash is either spam or ham.

One other thing I would like to see is a configurable option to tell the pyzor client to aggregate the blacklist/whitelist scores rather than leave that up to spamassassin.

Just my 2 cents worth.

Kind regards

Benedict White
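The basic scheme in this message (a report is identified by the message hash plus the ID of the server that first received it, and each such pair counts once) fits in a small data structure. The sketch below is illustrative only: `ReportCounter` is an invented name, and the IP addresses are documentation examples.

```python
from collections import defaultdict


class ReportCounter:
    """Count each digest once per originating server."""

    def __init__(self) -> None:
        self._origins = defaultdict(set)  # digest -> set of server IDs

    def add(self, digest: str, server_id: str) -> bool:
        """Record a report; return True only if this server ID is new for the digest."""
        if server_id in self._origins[digest]:
            return False
        self._origins[digest].add(server_id)
        return True

    def count(self, digest: str) -> int:
        """Number of distinct servers that have reported this digest."""
        return len(self._origins.get(digest, ()))


counter = ReportCounter()
counter.add("abc123", "192.0.2.1")
counter.add("abc123", "192.0.2.1")  # same report seen twice: counted once
counter.add("abc123", "192.0.2.7")
print(counter.count("abc123"))  # 2
```

As the message notes, this does not yet handle one server receiving several reports of the same spam; that is what the extra hit count and version stamp are for.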