Re: [Sqlgrey-users] Feature suggestion: kind of tarpitting

Lionel Bouton wrote:
>
> You forget

No I don't ;-)

> that big ISPs relay lots of mails with senders in more or less random
> domains (and there is legit mail in there). This is a very small
> percentage but given the volumes, it amounts to lots of emails -> this
> can very well pollute your connect table more than you want.

This really depends upon the quantity of legit traffic your own mailserver
receives... If your mailserver has modest or moderate traffic, it will
never receive "tons of legit mail" from any given server, and especially
not tons of new unknown traffic from a single IP in a short period.

> If we assume that the domain_awl will kick in properly for the ISPs
> regular domains and an average 30 minutes retry, you will allow only 5
> days * 48 retry periods * 10 mails per ISP that won't match the
> domain_awl: 2400 mails only from ISPa to ISPb.

First, don't focus on the number "10". It is an example, meant to
represent a user-settable parameter. It could well be 50 or 100,
depending upon the expected amount of incoming "new unknown traffic" at a
given site.

Second, your calculation is wrong. It is _not_ "2400 mails only from ISPa
to ISPb", it is "2400 NEW SENDERS from ISPa to ISPb". Once ONE message from
a given sender is accepted, the sender goes immediately to from_awl, and
further messages from the same sender are no longer delayed or greylisted.

2400 messages vs. 2400 NEW SENDERS make very different figures...
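
To make the distinction concrete, here is a rough Python sketch of the idea. It is not SQLgrey code: the class, the method names, the "over_limit" outcome and the simplified (ip, sender) key for from_awl are all assumptions made up for illustration. It only reproduces the 5 * 48 * 10 estimate and shows that the per-IP cap counts pending new senders, not messages.

```python
# Hypothetical model of the suggested feature; none of these names exist in SQLgrey.
RETRY_PERIODS_PER_DAY = 48   # one retry every 30 minutes, as in the quoted estimate
WINDOW_DAYS = 5              # the 5-day window used in the quoted estimate
MAX_NEW_PER_IP = 10          # example value only; meant to be user-settable


def max_new_senders(days=WINDOW_DAYS, periods=RETRY_PERIODS_PER_DAY,
                    limit=MAX_NEW_PER_IP):
    """Upper bound on NEW SENDERS (not messages) admitted from one IP."""
    return days * periods * limit


class Tarpit:
    """Toy model: cap the number of pending unknown senders per source IP."""

    def __init__(self, limit=MAX_NEW_PER_IP):
        self.limit = limit
        self.from_awl = set()   # (ip, sender) pairs already accepted once
        self.pending = {}       # ip -> senders still waiting for their retry

    def first_attempt(self, ip, sender):
        if (ip, sender) in self.from_awl:
            return "accept"                 # known sender: no delay at all
        waiting = self.pending.setdefault(ip, set())
        if sender in waiting or len(waiting) < self.limit:
            waiting.add(sender)
            return "greylist"               # normal greylisting delay
        return "over_limit"                 # assumed behaviour: refuse for now

    def retry(self, ip, sender):
        if sender in self.pending.get(ip, set()):
            self.pending[ip].discard(sender)    # frees a slot for this IP
            self.from_awl.add((ip, sender))     # later mail is not delayed
            return "accept"
        return self.first_attempt(ip, sender)


print(max_new_senders())
```

With the example values the bound is 2400 new senders per source IP over the window. Any sender already in from_awl bypasses the cap, so the number of messages that get through can be far higher.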

(And I repeat that I suggest this simple new feature should be
optional. Anyone who doesn't like it shouldn't have to use it...)

> If you put SQLgrey between two ISPs with 10-100 millions mail/day each,
> there are high chances that you will end up with mails that can't come
> through.

ISPs with such traffic volumes (what percentage of them among current
SQLgrey users?) would probably choose not to use this feature... Or maybe
they would ;-)

> The main problem is that you'd want the limit to be dynamic based on the
> traffic you are accepting from this IP. I don't think a fixed limit will
> do.

I don't share this point of view, because of the difference between "new
messages" and "new senders" explained above.

>> I think that to determine whether or not this may cause problems, we
>> have to try it out.

> From the look of your results to this query, it seems the spambot reuses
> the same source address quite often. We could refine the tarpitting with
> a check on the src and sender_* columns

I think this would weaken the system and limit its interest...

> that's unusual to have one sender sending more than a couple of mails to
> the same MX in less than 20 minutes and I can't think of any sane person
> sending them by the dozens in the same amount of time...

Take as an example a legit mailing-list (or newsletter) that changes its
source address (as the DSPAM MLs recently did): a big ISP can receive tons
of simultaneous "new unknown" traffic from this ML for different
recipients at the same time. (But the tarpitting system wouldn't hurt here
anyway: once the first message retry comes back, the sender ML goes to
from_awl, and the following messages get accepted without further delays.)

-- 
Michel Bouissou <mi...@bo...> OpenPGP ID 0xDDE8AC6E
Appel de 200 Informaticiens pour le NON au Traité Constitutionnel
Européen: http://www.200informaticiens.ras.eu.org