From: Michel B. <mi...@bo...> - 2005-06-06 14:34:37
On Sunday 05 June 2005 20:15, Lionel Bouton wrote:
>
> > Other subject: Lionel, have you had time to think again about the
> > tarpitting / throttling feature that I had suggested ? I still would like
> > it ;-)
>
> I'm not yet convinced it's a good idea. I'll sum up what I understand
> about tarpitting below for you to point mistakes or missing points.
>
> Tarpitting (refusing to create new connect entries if there are already
> <n> existing entries with the same source, with refinements to disable
> tarpitting when domain_awl holds entries for the source or enough
> entries exist in from_awl) could help preventing pollutions of the
> connect table from one single src.

Yes, and by the way it might prevent a possible attack against a greylisting
mailserver, as otherwise it would be easy for an evil system to flood a
connect table with thousands of entries or more...

> This would have two main benefits :
> B1/ faster DB access to the connect table (which probably is the most
> used on common config under heavy Zombie pressure...) from SQLgrey.
> B2/ easier analysis of the connect table by a curious sysadmin.

and B3/ the connect table wouldn't grow too big on disk

> Here are the risks I'm worried about :
> R1/ using tarpitting will also interfere with legit mails that don't
> match AWLs (more retries).

I have already stated that I believe it probably wouldn't be an issue,
especially if using some of the refinements which you cite above.

> R2/ SQLgrey will have to do another query on the connect table which
> would most probably kill the performance advantage we get from a smaller
> connect table

Yes, that's one more query on "connect", but this will only affect
"newcomers", and not sources that are already AWL'd.
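
To make the query count in this discussion concrete, here is a minimal sketch of the check as Lionel summarises it: one count on connect, then the two refinements only if the threshold is reached. It is written in Python against SQLite purely for illustration and is not SQLgrey code. The table names connect, domain_awl and from_awl come from the thread; the column layout, the function names and both thresholds (max_connect, min_from_awl) are assumptions.

```python
import sqlite3


def make_db():
    """Create an in-memory database with simplified, hypothetical tables."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE connect    (src TEXT, sender TEXT, rcpt TEXT);
        CREATE TABLE domain_awl (src TEXT, sender_domain TEXT);
        CREATE TABLE from_awl   (src TEXT, sender TEXT);
        """
    )
    return db


def should_tarpit(db, src, max_connect=5, min_from_awl=2):
    """Return True if a new connect entry for src should be refused.

    Meant to be called only for "newcomers", i.e. messages that did not
    already match an AWL entry. Both thresholds are assumed values.
    """
    # Extra query on connect: how many entries does this source already hold?
    (pending,) = db.execute(
        "SELECT COUNT(*) FROM connect WHERE src = ?", (src,)
    ).fetchone()
    if pending < max_connect:
        return False  # below the limit, the refinements are never queried

    # Refinement 1: any domain_awl entry for the source disables tarpitting.
    if db.execute(
        "SELECT 1 FROM domain_awl WHERE src = ? LIMIT 1", (src,)
    ).fetchone():
        return False

    # Refinement 2: enough from_awl entries for the source also disable it.
    (known,) = db.execute(
        "SELECT COUNT(*) FROM from_awl WHERE src = ?", (src,)
    ).fetchone()
    return known < min_from_awl


if __name__ == "__main__":
    db = make_db()
    db.executemany(
        "INSERT INTO connect VALUES (?, ?, ?)",
        [("192.0.2.1", f"spam{i}@example.com", "user@example.org")
         for i in range(5)],
    )
    print(should_tarpit(db, "192.0.2.1"))  # limit reached, no AWL entries
    db.execute("INSERT INTO domain_awl VALUES (?, ?)",
               ("192.0.2.1", "example.com"))
    print(should_tarpit(db, "192.0.2.1"))  # domain_awl entry for the source
```

In this sketch a source below the limit costs exactly one extra query, and the two AWL lookups run only for sources that have already reached the limit.
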
Some time ago, you were considering adding supplementary tables (such as a
"connect_awl" and even a "src_awl"), and the fact that this needed
supplementary queries for most of the messages didn't seem to be a
show-stopper ;-)

> and for the refinements, one query involving domain_awl
> and if needed another involving from_awl.

Yes, but those "refinements" will be used only for sources which already have
more than "n" records in connect, basically zombies. If we're under zombie
attack, the corresponding DB pages will probably be in cache and the queries
will be fast.

And when the "refinements" are called for non-zombies (big sites), their very
purpose is to make sure the messages will be accepted fast, so the
refinements won't be repeatedly used for long...

> My main problem for me was with R1 until the number of queries piled up
> to avoid the fact that in some cases (big ISPs consolidating email
> infrastructure, new smtp-out addresses, ...) you might very well
> introduce huge delays or even bounce mails (which was a show stopper for
> me). Now I think R2 will make the whole thing pointless.

I still believe that this idea should be tried out, to see how it
behaves in "real life".

Keeping zombies or virus-infected machines from heavily polluting connect, and
making sure that a "connect flooding attack" cannot be carried out, seem to me
good enough reasons to make it interesting...

--
Michel Bouissou <mi...@bo...> OpenPGP ID 0xDDE8AC6E