Re: [Sqlgrey-users] MySQL: first_seen updated when it shouldn't

Michel Bouissou wrote:

>On Sunday 05 June 2005 20:15, Lionel Bouton wrote:
>
>>>Other subject: Lionel, have you had time to think again about the
>>>tarpitting / throttling feature that I had suggested? I still would like
>>>it ;-)
>>>
>>I'm not yet convinced it's a good idea. I'll sum up what I understand
>>about tarpitting below so you can point out mistakes or missing points.
>>
>>Tarpitting (refusing to create new connect entries if there are already
>><n> existing entries with the same source, with refinements to disable
>>tarpitting when domain_awl holds entries for the source or enough
>>entries exist in from_awl) could help prevent pollution of the
>>connect table from a single src.
>>
>
>Yes, and it might by the way prevent a possible attack against a greylisting
>mailserver, as otherwise it would be easy for an evil system to flood a
>connect table with thousands of entries or more...
>

Thousands of entries shouldn't be a problem (there are configurations
out there with tens of thousands of entries in the connect table). In
fact hundreds of thousands of entries should be common on big
configurations.
You can already consider that ISPs are under DDoS from Windows zombies...

Today an attack on the connect table is mostly theoretical. To mount such
an attack, the attacker must:
- know at least one valid rcpt address (at least on configs that don't use
"reject_unlisted_recipient" before greylisting; they should),
- make hundreds of thousands of SMTP transactions to Postfix with
different envelope senders ("MAIL FROM"). On most configurations you
probably can't make more than a few tens of such transactions per second.
So it should take around 3 hours of constant hammering to start making a
difference on the DB.
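
As a rough check of the "around 3 hours" figure above, here is a small
back-of-the-envelope sketch. The inputs (100,000 connect entries, 10
transactions per second) are assumptions picked from the low end of the
ranges mentioned in this mail, not measurements.

```python
def flood_duration_hours(entries, transactions_per_second):
    """Hours of sustained SMTP transactions needed to create `entries`
    rows in the connect table at the given rate (one row per transaction)."""
    if transactions_per_second <= 0:
        raise ValueError("transactions_per_second must be positive")
    return entries / transactions_per_second / 3600.0


# Assumed inputs: 100,000 entries at 10 transactions/s.
print(round(flood_duration_hours(100_000, 10), 1))  # 2.8
```

With several hundred thousand entries the same rate gives a proportionally
longer attack, which only strengthens the point.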

If I were an attacker I would try to flood with TCP connections only. I
just checked: Postfix doesn't put a limit on connections per host, so, in
the default configuration at least, you can easily tie up all connections
from a single IP and DoS it in a second. So why bother with a slower way
that requires both knowing valid rcpts and the target running SQLgrey?

>>This would have two main benefits:
>>B1/ faster DB access from SQLgrey to the connect table (which probably is
>>the most used table on common configs under heavy zombie pressure...).
>>B2/ easier analysis of the connect table by a curious sysadmin.
>>
>
>and B3/ the connect table wouldn't grow too big on disk
>

True, but mostly connected to B1 (disk space is cheap).

>>Here are the risks I'm worried about :
>>R1/ using tarpitting will also interfere with legit mails that don't
>>match AWLs (more retries).
>>
>
>I have already stated that I believe it probably wouldn't be an issue,
>especially if using some of the refinements which you cite above.
>
>>R2/ SQLgrey will have to do another query on the connect table which
>>would most probably kill the performance advantage we get from a smaller
>>connect table
>>
>
>Yes, that's one more query on "connect", but this will only affect
>"newcomers", and not sources that are already AWL'd.
>
>Some time ago, you were considering adding supplementary tables (such as a
>"connect_awl" and even a "src_awl") and the fact that this needed
>supplementary queries for most of the messages didn't seem to be a
>show-stopper ;-)
>

The benefit of these two awls was clear to me: better AWL performance
(in fact there was a third awl considered: rcpt_awl). This was backed by
real-life cases (forward services that would be handled by rcpt_awl,
spam that exploits a weakness of from_awl which connect_awl would remove,
mail relays of big ISPs that would be handled by src_awl).

>>and for the refinements, one query involving domain_awl
>>and if needed another involving from_awl.
>>
>
>Yes, but those "refinements" will be used only for sources which already have
>more than "n" records in connect, basically zombies. If we're under zombie
>attack, the corresponding DB pages will probably be in-cache and the queries
>will be fast.
>
>And when the "refinements" are called for non-zombies (big sites), their very
>purpose is to make sure the messages will be accepted fast, so the
>refinements won't be repeatedly used for long...
>

Makes sense.
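
To make the query count under discussion concrete, here is a minimal
sketch of the check as summarised above: one extra query on connect, then
one on domain_awl and, only if needed, one on from_awl. It is not SQLgrey
code (SQLgrey is written in Perl). The simplified table layouts, the
function names and both thresholds are assumptions made up for
illustration, and SQLite is used only so the sketch is self-contained.

```python
import sqlite3

# Hypothetical thresholds: the "<n>" of the proposal and the number of
# from_awl entries considered "enough" to disable tarpitting.
MAX_CONNECT_PER_SRC = 3
MIN_FROM_AWL_ENTRIES = 2


def make_demo_db():
    """In-memory database with simplified (assumed) table layouts."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE connect    (sender TEXT, src TEXT, rcpt TEXT);
        CREATE TABLE domain_awl (sender_domain TEXT, src TEXT);
        CREATE TABLE from_awl   (sender TEXT, src TEXT);
    """)
    return db


def count_rows(db, table, src):
    # `table` only ever comes from the fixed names used below.
    row = db.execute(
        "SELECT COUNT(*) FROM %s WHERE src = ?" % table, (src,)).fetchone()
    return row[0]


def should_tarpit(db, src):
    """True if a new connect entry for `src` should be refused."""
    # Extra query on connect (the cost named in R2).
    if count_rows(db, "connect", src) < MAX_CONNECT_PER_SRC:
        return False
    # Refinement 1: any domain_awl entry for this source disables tarpitting.
    if count_rows(db, "domain_awl", src) > 0:
        return False
    # Refinement 2, only if needed: enough from_awl entries also disable it.
    return count_rows(db, "from_awl", src) < MIN_FROM_AWL_ENTRIES
```

In this sketch a newcomer below the threshold costs a single extra query;
the two refinement queries only run for sources that already have at least
MAX_CONNECT_PER_SRC pending entries.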

>>My main problem was with R1: in some cases (big ISPs consolidating email
>>infrastructure, new smtp-out addresses, ...) you might very well
>>introduce huge delays or even bounce mails, which was a show stopper for
>>me. Then the number of queries needed to avoid that piled up, and now I
>>think R2 will make the whole thing pointless.
>>
>
>I still believe that this idea should be tried out to see how it
>behaves in "real life".
>
>Keeping zombies and virus-infected machines from heavily polluting connect,
>and making sure that a "connect flooding attack" cannot be done, seem to me
>good enough reasons to make it interesting...
>

I'm not yet convinced this would be really useful, probably harmless though.

Lionel.