From: Lionel B. <lio...@bo...> - 2005-02-07 14:20:45
Michel Bouissou wrote the following on 02/07/2005 02:31 PM:

>On Sunday 06 February 2005 16:13, Lionel Bouton wrote:
>
>>I'm not inclined to add stuff just because it isn't a big deal, especially
>>in the database schema, which is the kind of thing I learned to change with
>>caution.
>
>Sure, but the database schema will have to change anyway (to include
>first_seen and rename an IP address field). So it would be a good moment to
>add one more field that costs little. Changing important fields in a database
>schema must be done with caution, I agree, but adding a purely informative
>field (that won't be used as a key or calculation base or whatever) has no
>consequences...

This makes sense to me. But there are so many purely informative fields.
For example, it just occurred to me that you *may* want to have a
"previously_seen" field in order to do queries like this:

SELECT sender_domain, host_ip, last_seen - previously_seen FROM
domain_awl ORDER BY (last_seen - previously_seen) LIMIT 50;

I'd even argue that this would be more useful than a counter field... but
still less useful than a log parsing tool.

>>Look at the TODO, there are already several things with a clear need...
>
>Yes. About the TODO, a couple of remarks:
>
>1/ I object to integrating SPF in any way in SQLgrey. SPF and greylisting
>are completely different systems, with different goals and approaches. SPF is
>implemented in separate patches (I use a Postfix patch) or policy servers. I
>don't see the point of integrating a goat and a cow together ;-) and using
>SPF to determine whether or not greylisting should be applied would surely
>be an easy way for spammers to defeat greylisting...

It may be; this entry is only a reminder for me.
I know for sure that
blindly trusting SPF is a no-no; the "experiment" only means that I'm
wondering whether having SQLgrey reject SPF-invalid senders instead of
greylisting them may be useful (the question is merely to find out whether
there's a point in combining both pieces of information outside Postfix, in
the policy server, or not) or whether relying on a separate policy server is
the way to go (and documenting this in the HOWTO). Don't pay too much
attention to this TODO entry.

>2/ I still would love to get sender- and recipient-based whitelisting in
>SQLgrey. Using Postfix tables for this purpose is not a satisfactory
>solution, for one can have a whole series of tests in Postfix, and different
>exceptions for each kind of test. One may want to skip greylisting for some
>sender (e.g. somebody@somedomain), but still want to perform SPF tests on
>somedomain, for example. Using a Postfix table with
>"somebody@somedomain => OK" would cause *all* subsequent tests to be skipped
>for this message, not only greylisting. And ordering the tests becomes a
>headache if different Postfix tables are used for this...

I don't find test ordering in Postfix the most intuitive thing either :-)

>It would seem more logical and easier to me for each "policy server" to
>carry its own independent whitelisting for the conditions under which its
>test should be performed or not...

For recipients I'm more than OK with it (this is the opt-in and opt-out
TODO entry). For senders, as I already said, I see it as a big hole in the
greylisting process.

>>In my opinion a separate log parsing tool would bring far more useful
>>stats.
>
>Sure, a log parsing tool is most useful, and probably most mail admins have
>something like this. But a counter gives *different* information that can be
>seen in the database at a glance, i.e. "is this sender a usual, frequent
>correspondent, or did he send only once?"
>(as some spammers or viruses do, and yes, sometimes they can pass through
>greylisting)...
>
>The counter would make it easy, for example, to extract the ratio of senders
>that have been seen only once to "repeating" senders present in the
>database. For analyzing the database, this is useful (and easy to get), and
>a log parsing tool won't give this information.

Now, that's more of an argument I can understand for storing this
information. But wouldn't someone prefer a "previously_seen" field (which,
by the way, is slightly more complex to implement)? If the entry can't be
found more than once in the logs covering the AWL TTL period, you'll have
nearly the same information...

Lionel.
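
For illustration only, the two kinds of query discussed in this thread can be sketched against an in-memory SQLite table. This is a minimal sketch under stated assumptions, not SQLgrey's actual schema or code: the table models only column names mentioned in the thread, "previously_seen" and "hit_count" are the proposed fields (the name "hit_count" is made up here for the counter), timestamps are plain integers, and the helper functions are hypothetical.

```python
import sqlite3

# Simplified stand-in for domain_awl. previously_seen and hit_count are the
# *proposed* fields from the thread; hit_count is a hypothetical name.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE domain_awl (
        sender_domain   TEXT NOT NULL,
        host_ip         TEXT NOT NULL,
        previously_seen INTEGER,              -- NULL until the second hit
        last_seen       INTEGER NOT NULL,
        hit_count       INTEGER NOT NULL DEFAULT 1
    )
    """
)


def record_hit(conn, sender_domain, host_ip, now):
    """Record one message: shift last_seen into previously_seen, bump the counter."""
    cur = conn.execute(
        "UPDATE domain_awl "
        "SET previously_seen = last_seen, last_seen = ?, hit_count = hit_count + 1 "
        "WHERE sender_domain = ? AND host_ip = ?",
        (now, sender_domain, host_ip),
    )
    if cur.rowcount == 0:
        # First time this (domain, IP) pair is seen.
        conn.execute(
            "INSERT INTO domain_awl (sender_domain, host_ip, last_seen) "
            "VALUES (?, ?, ?)",
            (sender_domain, host_ip, now),
        )


def shortest_gaps(conn, limit=50):
    """The previously_seen query from the thread, skipping entries seen only once."""
    return conn.execute(
        "SELECT sender_domain, host_ip, last_seen - previously_seen AS gap "
        "FROM domain_awl WHERE previously_seen IS NOT NULL "
        "ORDER BY gap LIMIT ?",
        (limit,),
    ).fetchall()


def once_vs_repeat(conn):
    """Count entries seen exactly once versus entries seen more than once."""
    once, repeat = conn.execute(
        "SELECT SUM(hit_count = 1), SUM(hit_count > 1) FROM domain_awl"
    ).fetchone()
    return once or 0, repeat or 0  # SUM() yields NULL on an empty table


# Made-up sample data: one repeating sender, one seen a single time.
record_hit(conn, "example.org", "192.0.2.1", 1000)
record_hit(conn, "example.org", "192.0.2.1", 1600)
record_hit(conn, "example.net", "192.0.2.2", 1200)

print(shortest_gaps(conn))
print(once_vs_repeat(conn))
```

With the update rule assumed here, an entry whose previously_seen is still NULL has been seen only once, so the once-versus-repeat split could also be derived from previously_seen alone, without a separate counter.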