Re: [Sqlgrey-users] [SPAM] Re: Patches for SQLgrey

SourceForge Headquarters 1320 Columbia Street Suite 310 San Diego, CA 92101 +1 (858) 422-6466

Dan Faerch wrote:

> There is a bug ill be hunting this week. Its an undocumented feature for 
> when running multiple servers against one database server. I needed that 
> only 1 specific server would do the cleanup, so i added (i think in 
> 1.7.5) the possibillity to add a file called "dont_db_clean" in the 
> sqlgrey config directory. Ive recently noticed that somehow i run into a 
> deadlock where nobody cleans the db, but the "last_clean" timestamp 
> still gets updated. resultning in nothing getting cleaned.

In smtpd_access_policy() it does:

    # Is it time for cleanups ?
    if ($current_time > ($self->{sqlgrey}{last_dbclean} +
	$self->{sqlgrey}{db_cleandelay})) {
            # updateconfig() returns affected_rows
1==>        if ($self->updateconfig('last_dbclean',
		$current_time,$self->{sqlgrey}{last_dbclean})) {
                  # If affected_rows > 0, its my job to clean the db
                  $self->{sqlgrey}{last_dbclean} = $current_time;
2==>              $self->start_cleanup();
	...

I.e. first it updates the last_dbclean value (#1) and only then calls
start_cleanup() (#2) which means last_dbclean is updated even if cleanup
failed for some reason. That's clearly a wrong order of actions.

Perhaps we could have a "cleanup_in_progress" config option that could
be used as a semaphore. Set it atomically with:
	$rows_affected = $dbi->do("
		UPDATE config
			SET cleanup_in_progress=NOW()
			WHERE cleanup_in_progress IS NULL");
Then if $rows_affected == 1 we succeeded and can go ahead with cleanup.
If $rows_affected == 0 someone else started cleanup in the meantime - no
problem, we'll go ahead with other processing.

Eventually we could rework the whole cleaning process. Instead of doing
(potentially massive) cleanups in fixed time periods there could be a
minor cleanup on every, say, 1000th call. No need to count the calls,
simple 'if(int(rand(1000))==0) { .. cleanup .. }' should be sufficient.
Perhaps have three independent rng calls [cleanup_connect(),
cleanup_from_awl(), cleanup_domain_awl()] to spread the load over time
even further.

The advantage here is that the cleanup will take place at intervals
relative to the load of the given server not in fixed time periods that
may be difficult to set right. Each of these cleanups should be pretty
fast, affecting only a few rows, therefore the problem with multiple
sqlgrey processes attempting to cleanup at the same time should go away
as well (firstly because they're not likely to trigger cleanup at the
very same moment and secondly because the time spent with tables locked
should be very short. Then we could get along without last_dbcleanup
altogether.

Thoughts?

Michal
-- 
* http://smtp-cli.logix.cz - the ultimate command line smtp client