From: Michael S. <Mic...@lr...> - 2005-06-09 16:35:32
On Wed, 8 Jun 2005, Michel Bouissou wrote:
> On Wednesday 08 June 2005 11:05, Lionel Bouton wrote:
> >
> > Michel, could you give us a ratio [...]
>
> > If other users could fetch Michel's build and test it in the same manner
> > too that would be great.
>
> Everybody can easily figure out if it could save many entries in their connect
> table by performing manually a simple sql query such as :
>
> select src, count(*) as cpt from connect group by src having cpt >= 3 order by
> cpt desc, src;
>
> (replace >= 3 with any value you would consider for setting the tarpitting
> threshold)
>
>
Here are my values using the above select statement:
number of entries in connect: 1.072.022
number of different IP addresses in connect: 110.904
average number of entries per IP address: 9.67
max. number of entries per IP address: 2.470
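
For anyone who wants to try the query on a small sample first, here is a rough sketch in Python (not sqlgrey's Perl) using an in-memory SQLite database. The three-column connect table, the sample rows and the function names are assumptions made up for illustration; the real connect table has more columns, and HAVING uses COUNT(*) directly instead of the alias to stay portable.

```python
import sqlite3


def demo_connection():
    """Build a toy in-memory 'connect' table (hypothetical, simplified schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE connect (src TEXT, sender TEXT, rcpt TEXT)")
    rows = (
        [("10.0.0.1", f"s{i}@example.org", "a@example.net") for i in range(5)]
        + [("10.0.0.2", f"t{i}@example.org", "b@example.net") for i in range(3)]
        + [("10.0.0.3", "u@example.org", "c@example.net")]
    )
    conn.executemany("INSERT INTO connect VALUES (?, ?, ?)", rows)
    return conn


def per_source_counts(conn, threshold):
    """Sources with at least `threshold` entries, largest first."""
    return conn.execute(
        "SELECT src, COUNT(*) AS cpt FROM connect "
        "GROUP BY src HAVING COUNT(*) >= ? "
        "ORDER BY cpt DESC, src",
        (threshold,),
    ).fetchall()


def summary(conn):
    """The four summary figures: total, distinct sources, average, maximum."""
    total, distinct = conn.execute(
        "SELECT COUNT(*), COUNT(DISTINCT src) FROM connect"
    ).fetchone()
    largest = conn.execute(
        "SELECT MAX(cpt) FROM "
        "(SELECT COUNT(*) AS cpt FROM connect GROUP BY src) AS t"
    ).fetchone()[0]
    return {
        "entries": total,
        "sources": distinct,
        "average": round(total / distinct, 2),
        "max": largest,
    }
```

Calling per_source_counts(demo_connection(), 3) on the toy data returns the two sources that have three or more entries.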
thrott. | num of  | num of    | num of  | left    | % reduc
num.    | IP addr | entries   | thrott  | entries |
========+=========+===========+=========+=========+========
      3 |  55.366 | 1.001.789 | 891.057 | 180.965 | 83.12 %
      5 |  42.124 |   956.904 | 788.408 | 283.614 | 73.54 %
     10 |  29.092 |   870.638 | 608.810 | 463.212 | 56.79 %
     20 |  13.186 |   672.086 | 421.552 | 650.470 | 39.32 %
     30 |   7.908 |   549.151 | 319.819 | 752.203 | 29.83 %
     40 |   5.367 |   463.256 | 253.943 | 818.079 | 23.69 %
     50 |   3.696 |   389.663 | 208.559 | 863.463 | 19.45 %
     60 |   2.432 |   323.176 | 179.688 | 892.334 | 16.76 %
     70 |   1.928 |   290.926 | 157.894 | 914.128 | 14.73 %
     80 |   1.605 |   266.954 | 140.159 | 931.863 | 13.07 %
     90 |   1.367 |   246.908 | 125.245 | 946.777 | 11.68 %
    100 |   1.164 |   227.773 | 112.537 | 959.485 | 10.50 %
thrott. num.: number of entries at which throttling begins
num of IP addr: number of unique IP addresses = number of rows returned by
the above select statement
num of entries: total number of entries returned by the select statement
num of thrott: num of entries - (thrott. num. - 1) * num of IP addr
left entries: number of entries in connect - num of thrott
% reduc: num of thrott * 100 / number of entries in connect
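
To make the column definitions concrete, here is a small Python sketch of the arithmetic (Python only for illustration; the function names are my own and not part of sqlgrey). throttle_stats applies the three formulas to totals you already have, and throttle_row derives a whole table row from a list of per-IP entry counts.

```python
def throttle_stats(total_entries, num_ips, num_entries, threshold):
    """Apply the 'num of thrott', 'left entries' and '% reduc' formulas."""
    throttled = num_entries - (threshold - 1) * num_ips
    left = total_entries - throttled
    reduction = round(throttled * 100 / total_entries, 2)
    return throttled, left, reduction


def throttle_row(counts, threshold):
    """Build one table row from per-IP entry counts (one integer per source)."""
    total_entries = sum(counts)
    hit = [c for c in counts if c >= threshold]  # rows the select would return
    num_ips = len(hit)
    num_entries = sum(hit)
    throttled, left, reduction = throttle_stats(
        total_entries, num_ips, num_entries, threshold
    )
    return num_ips, num_entries, throttled, left, reduction
```

Feeding in the figures for a threshold of 3, throttle_stats(1072022, 55366, 1001789, 3), gives 891057 throttled entries, 180965 left and a reduction of 83.12 %, matching the first row of the table.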
This means throttling would substantially reduce the size of our connect
table and, hopefully, the chance of spam getting through.
My primary goal was to reduce the delay for regular messages. After that,
I wanted to look at algorithms which would reduce the number of spams.
Throttling would have been my first attempt at that. But since I had not
yet thought about how such an algorithm could work, it's great that Michel
has already done the work.
However, I would not incorporate this algorithm into 1.6.0 but into 1.7.0.
If we put the other tables I already talked about into sqlgrey, the
throttling algorithm must be adapted. But even if not, I am not sure
the algorithm is flexible enough. For example, let's assume the value of
connect_src_throttle is 21 and the value of group_domain_level is 10.
- if there are one or two entries in domain_awl, a new triple would be
accepted.
- if there are 20 entries in connect as well as in from_awl and 0 in
domain_awl, a new triple would be throttled, but 20 entries in from_awl
should be as good as 2 entries in domain_awl because of
group_domain_level.
Therefore a possible change to the algorithm would be to incorporate the
relation between from_awl and domain_awl, something like:
# Throttling too many connections from same new host
if (defined $self->{sqlgrey}{connect_src_throttle} and
    $self->{sqlgrey}{connect_src_throttle} > 0 and
    $self->count_src_connect($cltid) >= $self->{sqlgrey}{connect_src_throttle}) {
    # without the following tests there is a good chance of losing emails
    # from a new server of a big ISP
    # (count_src_domain_awl and count_src_from_awl are proposed helpers;
    # group_domain_level is assumed to be available under $self->{sqlgrey})
    my $threshold = $self->{sqlgrey}{connect_src_throttle}
        - $self->count_src_domain_awl($cltid) * $self->{sqlgrey}{group_domain_level};
    if ($threshold > 0) {
        $threshold -= $self->count_src_from_awl($cltid);
        if ($threshold > 0) {
            $self->mylog('grey', 2, "throttling: $cltid, $sender_name\@$sender_domain -> $recipient");
            return ($self->{sqlgrey}{reject_first} . ' Throttling too many connections from new source - ' .
                    ' Try again later. ');
        }
    }
}
BTW, this code snippet is not tested!
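
To play with the numbers without a running sqlgrey, the decision can be written as a pure function. This is only a sketch in Python of the proposed logic, not sqlgrey code: should_throttle and its parameters are hypothetical names, and the three counts stand in for what count_src_connect, count_src_domain_awl and count_src_from_awl would return.

```python
def should_throttle(connect_count, domain_awl_count, from_awl_count,
                    connect_src_throttle, group_domain_level):
    """Return True if a new triple from this source would be throttled."""
    # Throttling disabled or not configured.
    if connect_src_throttle is None or connect_src_throttle <= 0:
        return False
    # Not enough pending entries in connect to consider throttling at all.
    if connect_count < connect_src_throttle:
        return False
    # Each domain_awl entry is worth group_domain_level from_awl entries.
    threshold = connect_src_throttle - domain_awl_count * group_domain_level
    if threshold <= 0:
        return False
    # Each from_awl entry lowers the remaining threshold by one.
    threshold -= from_awl_count
    return threshold > 0
```

With connect_src_throttle = 21 and group_domain_level = 10, a source with 21 entries in connect is let through once it has 3 entries in domain_awl, or 21 in from_awl, or 1 in domain_awl plus 11 in from_awl. With exactly 20 entries in from_awl the remaining threshold is still 1, so the source would be throttled; whether that boundary is what we want is something to decide when the algorithm is actually tested.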
Michael Storz
-------------------------------------------------
Leibniz-Rechenzentrum ! <mailto:St...@lr...>
Barer Str. 21 ! Fax: +49 89 2809460
80333 Muenchen, Germany ! Tel: +49 89 289-28840