Thread: [Sqlgrey-users] Performance: strict versus loose expiration time

[Sqlgrey-users] Performance: strict versus loose expiration time

From: Michael S. <Mic...@lr...> - 2007-02-22 09:16:10

Hi Dan and Lionel,

Over the last few days I have been thinking about the performance impact of
implementing a strict versus a loose expiration time. At the moment, Sqlgrey
uses a mixed implementation: the is_in_* subroutines are strict, whereas the
count subroutines are loose.

Strict means that every SELECT has a condition in its WHERE clause so that
only active (unexpired) entries are returned from a table.

Loose means that there is no such condition in the WHERE clause. Expiration
is handled only by the cleanup subroutine, which deletes expired entries
from the table.
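
To make the two definitions concrete, here is a minimal sketch of the SELECT
text each variant would produce for a from_awl lookup. The helper name
build_lookup_sql and the literal cutoff expression are illustrative
assumptions, not Sqlgrey code.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Sqlgrey): returns the lookup SQL for
# from_awl. $cutoff is an SQL expression for the oldest acceptable last_seen.
sub build_lookup_sql {
    my ($mode, $cutoff) = @_;
    my $sql = 'SELECT 1 FROM from_awl'
            . ' WHERE sender_name = ? AND sender_domain = ? AND src = ?';
    # strict: the WHERE clause itself filters out expired rows
    $sql .= " AND last_seen > $cutoff" if $mode eq 'strict';
    # loose: nothing is added; expired rows vanish only when cleanup deletes them
    return $sql;
}

my $strict_sql = build_lookup_sql('strict',
    "timestamp '2007-02-28 16:41:08' - INTERVAL 36 DAY");
my $loose_sql  = build_lookup_sql('loose');

print "strict: $strict_sql\n";
print "loose:  $loose_sql\n";
```

The only difference between the two strings is the trailing last_seen
condition, which is what the rest of this message is about.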

The patches I sent Dan implement a strict expiration time for all
subroutines.

However, implementing a loose expiration time instead could bring some
performance gains.

If a SELECT does not have the expiration condition in the WHERE clause, its
text is character-for-character identical for the same triplet (or part of
it). This allows us to use prepare_cached instead of prepare, which gives us
the first performance gain. As a second step, this SELECT will hit the query
cache as long as the table involved has not changed. This could give us a
second gain, especially for the tables from_awl and domain_awl, which do not
change often.

In the loose implementation the maximum expiration time would be up to
db_cleandelay seconds longer than in the strict one. With the default this
means it is just half an hour longer, which is not noticeable, at least for
the AWLs.

The only problem I see at the moment is a race condition between cleanup
and is_in/update, where the update could try to update an entry that has
just been deleted. The update must therefore take this into account, similar
to the put_in subroutines, which allow for the possibility that another
Sqlgrey daemon has already inserted the entry.
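
The race could be handled roughly as follows. This is only a sketch under
assumptions: update_or_reinsert is a hypothetical wrapper, the two code
references stand in for the real UPDATE and INSERT statements, and the
UPDATE callback is assumed to return the number of affected rows, the way
DBI's execute does for non-SELECT statements. The table is simulated with a
hash so the example runs without a database.

```perl
use strict;
use warnings;

# Hypothetical wrapper (not Sqlgrey code). $update and $insert are code refs
# standing in for the real statements; $update must return the affected-row
# count (DBI's execute returns "0E0", numerically 0, when no row matched).
sub update_or_reinsert {
    my ($update, $insert) = @_;
    my $rows = $update->();
    return 'updated' if defined $rows and $rows > 0;
    # The entry disappeared between is_in and update (cleanup deleted it),
    # so put it back instead of silently losing the update.
    $insert->();
    return 'reinserted';
}

# Simulated table with a single entry (made-up key).
my $key      = 'user@example.com/192.0.2.1';
my %from_awl = ($key => 'old timestamp');

my $update = sub {
    return '0E0' unless exists $from_awl{$key};
    $from_awl{$key} = 'new timestamp';
    return 1;
};
my $insert = sub { $from_awl{$key} = 'new timestamp' };

my $first = update_or_reinsert($update, $insert);    # row is present
delete $from_awl{$key};                              # cleanup runs in between
my $second = update_or_reinsert($update, $insert);   # row is gone
print "$first, $second\n";
```

A real implementation would also have to tolerate the INSERT failing because
another daemon re-created the row first, as the put_in subroutines already do.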

What is your opinion?

Michael Storz
--
======================================================
Leibniz-Rechenzentrum      |    <mailto:St...@lr...>
Boltzmannstr. 1            |    Fax: +49 89 35831-9700
85748 Garching / Germany   |    Tel: +49 89 35831-8840
======================================================

Re: [Sqlgrey-users] Performance: strict versus loose expiration time

From: Dan F. <da...@ha...> - 2007-02-28 00:10:17

Michael Storz wrote:

> db_cleandelay seconds longer than the strict one. Taking the default this
> means it is just half an hour longer, which is at least not noticeable for
> the AWLs.
>
>   
Well, we can't really assume what people use for db_cleandelay. I use 60
seconds. Someone might do it once every week. You never know ;).

> In the next step this select statement will hit the
> query cache as long as the table involved hasn't changed. This could give
> us a second gain especially for the tables from_awl or domain_awl which do
> not change often.
>   
You mean the query cache like the one built into newer MySQL versions?
I don't know exactly how those work, but why should there be any
difference when the actual WHERE clause doesn't change? I don't see
anywhere that, e.g., a timestamp is put into the statement. It's all
static information and thus, in my head, should still hit the query cache.

Most selects need a WHERE clause, e.g. WHERE sender_name =, WHERE src =,
etc. So I don't see how adding the max_connect_age, reconnect_delay or
awl_age values (which are static) will change how the cache works.


- Dan

PS: Based on this, are you saying I shouldn't apply the "is_active" patch
you made?

Re: [Sqlgrey-users] Performance: strict versus loose expiration time

From: Michael S. <Mic...@lr...> - 2007-02-28 15:39:43

On Wed, 28 Feb 2007, Dan Faerch wrote:

> Michael Storz wrote:
>
> > db_cleandelay seconds longer than the strict one. Taking the default this
> > means it is just half an hour longer, which is at least not noticeable for
> > the AWLs.
> >
> >
> Well, we can't really assume what people use for db_cleandelay. I use 60
> seconds. Someone might do it once every week. You never know ;).

We use 1 hour. I think, because as you said, we do not know what people
are using for db_cleandelay, we should either provide a strict
implementation or the possibility for the user to choose which
implementation he wants. As I said in another email, I think I will be
able to provide a patch for such a choice.

>
> > In the next step this select statement will hit the
> > query cache as long as the table involved hasn't changed. This could give
> > us a second gain especially for the tables from_awl or domain_awl which do
> > not change often.
> >
> You mean the query cache like the one built into newer MySQL versions?

Right, since version 4.0.1 MySQL has a query cache.

> I don't know exactly how those work, but why should there be any
> difference when the actual WHERE clause doesn't change? I don't see
> anywhere that, e.g., a timestamp is put into the statement. It's all
> static information and thus, in my head, should still hit the query cache.
>
> Most selects need a WHERE clause, e.g. WHERE sender_name =, WHERE src =,
> etc. So I don't see how adding the max_connect_age, reconnect_delay or
> awl_age values (which are static) will change how the cache works.
>

The query cache uses the text of the SELECT statement as its key. That
means that 'select' and 'SELECT' already count as different statements. If
we implement strict expiration time, then every SELECT in the is_in or count
subroutines will contain a timestamp:

Example from sub is_in_from_awl:

    my $sth = $self->prepare("SELECT 1 FROM $from_awl " .
                             'WHERE sender_name = ? ' .
                             'AND sender_domain = ? ' .
                             'AND src = ? ' .
                             'AND last_seen > ' .
                             $self->past_tstamp($self->{sqlgrey}{awl_age},
                                                'DAY')
                            );
If now() is 2007-02-28 16:41:08, for MySQL you will get the statement

SELECT 1 FROM from_awl WHERE sender_name = ? AND sender_domain = ? AND src
= ? AND last_seen > timestamp '2007-02-28 16:41:08' - INTERVAL 36 DAY

One second later, now() changes to 2007-02-28 16:41:09 and you have a
textually different SELECT statement. For a loose implementation the part
"AND last_seen > timestamp '2007-02-28 16:41:08' - INTERVAL 36 DAY" will
be omitted and you always get the same statement (for the same src and
sender).
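
The effect is easy to reproduce without a database. The following is a
minimal sketch, assuming a hypothetical cutoff_sql helper that renders the
cutoff as literal SQL text in the style of the MySQL statement above (the
real past_tstamp may format it differently); it builds the strict statement
for two consecutive seconds and compares the strings.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical stand-in for past_tstamp(): renders the cutoff as literal SQL
# text from an epoch value.
sub cutoff_sql {
    my ($epoch, $days) = @_;
    my $now = strftime('%Y-%m-%d %H:%M:%S', gmtime $epoch);
    return "timestamp '$now' - INTERVAL $days DAY";
}

my $base  = 'SELECT 1 FROM from_awl WHERE sender_name = ? '
          . 'AND sender_domain = ? AND src = ?';
my $t     = 1172680868;    # 2007-02-28 16:41:08 UTC
my $at_t  = "$base AND last_seen > " . cutoff_sql($t,     36);
my $at_t1 = "$base AND last_seen > " . cutoff_sql($t + 1, 36);

# Strict: the two strings differ in exactly one character of the timestamp.
my $verdict = $at_t eq $at_t1 ? 'same text' : 'different text';
print "strict, one second apart: $verdict\n";

# Loose: the statement is just $base, whatever the clock says.
print "loose: $base\n";
```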


>
> - Dan
>
> PS: Based on this, are you saying I shouldn't apply the "is_active" patch
> you made?

As I said, I'm thinking about providing a more elaborate patch :-)

Michael Storz

Re: [Sqlgrey-users] Performance: strict versus loose expiration time

From: Dan F. <da...@ha...> - 2007-02-28 16:01:29

Michael Storz wrote:
> SELECT 1 FROM from_awl WHERE sender_name = ? AND sender_domain = ? AND src
> = ? AND last_seen > timestamp '2007-02-28 16:41:08' - INTERVAL 36 DAY
>   
Aha. Interesting. So how about statically setting the seconds to 00 every
time? That might be possible, and then the statement text, and with it the
cache key, would change only once a minute.
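
A minimal sketch of that idea, assuming a hypothetical minute_tstamp helper
(nothing like it is known to exist in Sqlgrey): round the epoch down to the
start of its minute before the timestamp is rendered.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper: render an epoch value with the seconds forced to 00
# by rounding down to the start of the minute.
sub minute_tstamp {
    my ($epoch) = @_;
    my $floored = $epoch - ($epoch % 60);
    return strftime('%Y-%m-%d %H:%M:%S', gmtime $floored);
}

my $t  = 1172680868;                 # 2007-02-28 16:41:08 UTC
my $t1 = minute_tstamp($t);          # second 08 of the minute
my $t2 = minute_tstamp($t + 51);     # second 59 of the same minute
my $t3 = minute_tstamp($t + 52);     # first second of the next minute
print "$t1\n$t2\n$t3\n";
```

All requests within the same minute then produce the same timestamp text.
The price is that the cutoff can be up to 59 seconds too early, so an entry
may stay valid slightly longer than configured, independent of db_cleandelay.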

- Dan

Re: [Sqlgrey-users] Performance: strict versus loose expiration time

From: Lionel B. <lio...@bo...> - 2007-02-28 16:01:39

Michael Storz wrote the following on 28.02.2007 16:39 :
> On Wed, 28 Feb 2007, Dan Faerch wrote:
>
>   
>> Michael Storz wrote:
>>
>>     
>>> db_cleandelay seconds longer than the strict one. Taking the default this
>>> means it is just half an hour longer, which is at least not noticeable for
>>> the AWLs.
>>>
>>>
>>>       
>> Well, we can't really assume what people use for db_cleandelay. I use 60
>> seconds. Someone might do it once every week. You never know ;).
>>     
>
> We use 1 hour. I think, because as you said, we do not know what people
> are using for db_cleandelay, we should either provide a strict
> implementation or the possibility for the user to choose which
> implementation he wants. As I said in another email, I think I will be
> able to provide a patch for such a choice.
>   

Hum if you can get away without loose, please do so. I consider the 
loose algorithm to be a bug. What I don't like with the possibility to 
choose between loose and strict algorithm is that it will be hard to 
explain to the user what she loses using loose...
I'd prefer users not to have to worry about such details. Unless you can 
demonstrate a significant performance difference between the two, please 
only provide 'strict' support.

>> I don't know exactly how those work, but why should there be any
>> difference when the actual WHERE clause doesn't change? I don't see
>> anywhere that, e.g., a timestamp is put into the statement. It's all
>> static information and thus, in my head, should still hit the query cache.
>>
>> Most selects need a WHERE clause, e.g. WHERE sender_name =, WHERE src =,
>> etc. So I don't see how adding the max_connect_age, reconnect_delay or
>> awl_age values (which are static) will change how the cache works.
>>
>>     
>
> The query cache uses the select statement as a "text" key. That means,
> using 'select' or 'SELECT' is already a different statement. If we
> implement strict expiration time, then every select of the is_in or count
> subroutines will have a timestamp:
>
> Example from sub is_in_from_awl:
>
>     my $sth = $self->prepare("SELECT 1 FROM $from_awl " .
>                              'WHERE sender_name = ? ' .
>                              'AND sender_domain = ? ' .
>                              'AND src = ? ' .
>                              'AND last_seen > ' .
>                              $self->past_tstamp($self->{sqlgrey}{awl_age},
>                                                 'DAY')
>                             );
> If now() is 2007-02-28 16:41:08, for MySQL you will get the statement
>
> SELECT 1 FROM from_awl WHERE sender_name = ? AND sender_domain = ? AND src
> = ? AND last_seen > timestamp '2007-02-28 16:41:08' - INTERVAL 36 DAY
>
> One second later, now() changes to 2007-02-28 16:41:09 and you have a
> textually different SELECT statement. For a loose implementation the part
> "AND last_seen > timestamp '2007-02-28 16:41:08' - INTERVAL 36 DAY" will
> be omitted and you always get the same statement (for the same src and
> sender).
>
>   

I don't believe this is a good example. I believe you get a benefit from
the query cache only if the whole query (not only the prepared
statement) is exactly the same. So in the example above you'll need to
have the same timestamp AND the same sender_name, sender_domain, src.
Losing the timestamp doesn't bring much benefit, because the rate at
which from_awl is updated (entries added, updated or deleted) is
probably far higher than the rate at which the same sender+src comes back.

Lionel

Re: [Sqlgrey-users] Performance: strict versus loose expiration time

From: Michael S. <Mic...@lr...> - 2007-02-28 17:27:45

On Wed, 28 Feb 2007, Lionel Bouton wrote:

>
> I don't believe this is a good example. I believe you get a benefit from
> the query cache only if the whole query (not only the prepared
> statement) is exactly the same. So in the example above you'll need to
> have the same timestamp AND the same sender_name, sender_domain, src.
> Losing the timestamp doesn't bring much benefit, because the rate at
> which from_awl is updated (entries added, updated or deleted) is
> probably far higher than the rate at which the same sender+src comes back.
>

Lionel,

I am not talking about normal operation. What I am concerned about are
spam attacks, where you have to react as fast as possible.

I just compiled some statistics on a spam attack from a few minutes ago. In
the 10-minute window from 16:10 to 16:20 we got 16296 triplets from 52 IP
addresses, all from the same spammer. This is part of the log for one IP
address (I removed all attributes other than client_address and sender):

Feb 28 16:15:58 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:15:58 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:15:58 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:15:58 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:15:58 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:15:58 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:15:58 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:15:58 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:15:58 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:15:58 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:18:21 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:18:21 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:18:21 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:18:21 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:18:21 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:18:21 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:18:21 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:18:21 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:18:21 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:18:21 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:14:43 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:14:43 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:14:43 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:14:43 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:14:43 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:14:43 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:14:43 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:14:43 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:14:43 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:14:43 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:17:02 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:17:02 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:17:02 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:17:02 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:17:02 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:17:02 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:17:02 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:17:02 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:17:02 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:17:02 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:19:50 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:19:50 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:19:50 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:19:50 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:19:50 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:19:50 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:19:50 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:19:50 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:19:50 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...
Feb 28 16:19:50 lxmhs25 sqlgrey[5736]: request: client_address=124.6.181.182 sen...@ti...

As you can see, each generated sender is used 10 times. With loose, only
the first query will go to from_awl; the other 9 queries will be answered
by the query cache.

On the other hand, you have to consider throttling. If none of the triplets
is accepted because of an entry in an AWL, 16296 throttling decisions have
been made.

The code for the decision is:

    # Throttling too many connections from same new host
    if (defined $self->{sqlgrey}{connect_src_throttle} and $self->{sqlgrey}{connect_src_throttle} > 0) {
        if ($self->count_src_connect($cltid) >= $self->{sqlgrey}{connect_src_throttle}
            and $self->count_src_domain_awl($cltid) < 1
            and $self->count_src_from_awl($cltid) < $self->{sqlgrey}{connect_src_throttle}) {
                $self->mylog('grey', 2, "throttling: $cltid($addr), $sender_name\@$sender_domain -> $recipient");
                return ($self->{sqlgrey}{reject_first} . ' Throttling too many connections from new source - ' .
                ' Try again later.');
        }
    }

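That means that for every decision up to 3 count subroutines are executed
(the conditions are joined with "and", so a later one runs only if the
earlier ones hold):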

- $self->count_src_connect($cltid)
- $self->count_src_domain_awl($cltid)
- $self->count_src_from_awl($cltid)

In the 10-minute window 1,475 triplets were accepted because of from_awl,
domain_awl or a successful reconnect, which means that each time the query
cache for either from_awl or domain_awl and for connect is emptied.

This means that of the nearly 50,000 count queries on the 3 tables only
about 3,000 will not be answered by the cache.
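
The estimate can be reproduced with a few lines of arithmetic. This sketch
only restates the figures from this message; the two per-event factors (3
count queries per throttling decision, 2 cached tables emptied per accepted
triplet, one miss per emptied table) are the simplifying assumptions behind
the estimate, not measurements.

```perl
use strict;
use warnings;

# Figures from the message; the two factors are the stated assumptions.
my $triplets            = 16296;   # throttling decisions in 10 minutes
my $counts_per_decision = 3;       # count_src_connect/domain_awl/from_awl
my $accepted            = 1475;    # accepted triplets (each modifies tables)
my $tables_emptied      = 2;       # from_awl or domain_awl, plus connect

my $count_queries = $triplets * $counts_per_decision;
my $cache_misses  = $accepted * $tables_emptied;
my $hit_ratio     = 1 - $cache_misses / $count_queries;

printf "count queries: %d\n", $count_queries;
printf "cache misses:  %d\n", $cache_misses;
printf "hit ratio:     %.1f%%\n", 100 * $hit_ratio;
```

That gives 48,888 count queries and 2,950 misses, which are the "nearly
50,000" and "about 3,000" quoted above.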

The main question remains: is an answer from the cache really much faster
than one from a table?

I hope this explains why I am looking into the issues of loose or strict
implementation.

Michael Storz

Re: [Sqlgrey-users] Performance: strict versus loose expiration time

From: Lionel B. <lio...@bo...> - 2007-03-01 14:46:47

Michael Storz wrote the following on 28.02.2007 18:27 :
> On Wed, 28 Feb 2007, Lionel Bouton wrote:
>
>   
>> I don't believe this is a good example. I believe you get a benefit from
>> the query cache only if the whole query (not only the prepared
>> statement) is exactly the same. So in the example above you'll need to
>> have the same timestamp AND the same sender_name, sender_domain, src.
>> Losing the timestamp doesn't bring much benefit, because the rate at
>> which from_awl is updated (entries added, updated or deleted) is
>> probably far higher than the rate at which the same sender+src comes back.
>>
>>     
>
> Lionel,
>
> I am not talking about normal operation. What I am concerned about are
> spam attacks, where you have to react as fast as possible.
>
>   

I see your point. Smoothing out load spikes is clearly a good goal. I'll 
eagerly await your bench results :-)

Lionel.

Re: [Sqlgrey-users] Performance: strict versus loose expiration time

From: Lionel B. <lio...@bo...> - 2007-02-28 14:25:07

Michael Storz wrote the following on 22.02.2007 10:16 :
> Hi Dan and Lionel,
>
> the last days I was thinking about the performance impact of implementing
> a strict versus a loose expiration time. At the moment, Sqlgrey uses a
> mixed implementation, the is_in_* subroutines use a strict implementation,
> whereas the count subroutines use a loose implementation.
>
> strict means, every select has a condition in the where-clause which
> results in returning only active entries from a table.
>
> loose means, there is no such condition in the where-clause. Expiration is
> handled only by deleting expired entries from a table via a cleanup
> subroutine.
>   

strict seems better to me. It didn't seem to matter much when I coded 
the loose count subroutines, but looking back on the code I don't like 
the fact that db_cleandelay has an impact on the auto-whitelist behaviour.

I vote for the "strict" behaviour.

Lionel

Re: [Sqlgrey-users] Performance: strict versus loose expiration time

From: Michael S. <Mic...@lr...> - 2007-02-28 15:09:10

On Wed, 28 Feb 2007, Lionel Bouton wrote:

> Michael Storz wrote the following on 22.02.2007 10:16 :
> > Hi Dan and Lionel,
> >
> > the last days I was thinking about the performance impact of implementing
> > a strict versus a loose expiration time. At the moment, Sqlgrey uses a
> > mixed implementation, the is_in_* subroutines use a strict implementation,
> > whereas the count subroutines use a loose implementation.
> >
> > strict means, every select has a condition in the where-clause which
> > results in returning only active entries from a table.
> >
> > loose means, there is no such condition in the where-clause. Expiration is
> > handled only by deleting expired entries from a table via a cleanup
> > subroutine.
> >
>
> strict seems better to me. It didn't seem to matter much when I coded
> the loose count subroutines, but looking back on the code I don't like
> the fact that db_cleandelay has an impact on the auto-whitelist behaviour.
>
> I vote for the "strict" behaviour.
>
> Lionel
>

How about this: I think I would be able to deliver a patch which decides at
compile/parse time whether a loose or strict implementation is used (this
means the code for the other variant is eliminated at parse time). This
could be triggered by a command-line flag. The default would be strict.
But before I do this I have to test whether a loose implementation has any
noticeable speed advantage (and this needs some time).
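
One way this could look in Perl is sketched below. It is not the patch
itself: the --loose flag and the names STRICT_EXPIRY and expiry_clause are
made up for illustration. The sketch relies on the fact that a "use
constant" declaration is evaluated while the file is being compiled, so Perl
can inline the constant and drop the branch that is never taken.

```perl
use strict;
use warnings;

# Assumption: a hypothetical --loose command-line flag selects the variant.
# "use" runs at compile time, when @ARGV is already populated, so the value
# is fixed before the subroutines below are compiled.
use constant STRICT_EXPIRY =>
    (scalar grep { $_ eq '--loose' } @ARGV) ? 0 : 1;

# Returns the SQL fragment to append to a lookup statement.
sub expiry_clause {
    my ($cutoff) = @_;
    if (STRICT_EXPIRY) {
        # Compiled in only when the constant is true.
        return " AND last_seen > $cutoff";
    }
    return '';
}

my $clause = expiry_clause("timestamp '2007-02-28 16:41:08' - INTERVAL 36 DAY");
my $mode   = STRICT_EXPIRY ? 'strict' : 'loose';
print "$mode: [$clause]\n";
```

Run without arguments, the script uses the strict variant, which matches the
proposed default.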

Michael Storz