From: Michal L. <ml...@lo...> - 2009-08-18 01:58:06
|
Dan Faerch wrote: > There is a bug ill be hunting this week. Its an undocumented feature for > when running multiple servers against one database server. I needed that > only 1 specific server would do the cleanup, so i added (i think in > 1.7.5) the possibillity to add a file called "dont_db_clean" in the > sqlgrey config directory. Ive recently noticed that somehow i run into a > deadlock where nobody cleans the db, but the "last_clean" timestamp > still gets updated. resultning in nothing getting cleaned. In smtpd_access_policy() it does: # Is it time for cleanups ? if ($current_time > ($self->{sqlgrey}{last_dbclean} + $self->{sqlgrey}{db_cleandelay})) { # updateconfig() returns affected_rows 1==> if ($self->updateconfig('last_dbclean', $current_time,$self->{sqlgrey}{last_dbclean})) { # If affected_rows > 0, its my job to clean the db $self->{sqlgrey}{last_dbclean} = $current_time; 2==> $self->start_cleanup(); ... I.e. first it updates the last_dbclean value (#1) and only then calls start_cleanup() (#2) which means last_dbclean is updated even if cleanup failed for some reason. That's clearly a wrong order of actions. Perhaps we could have a "cleanup_in_progress" config option that could be used as a semaphore. Set it atomically with: $rows_affected = $dbi->do(" UPDATE config SET cleanup_in_progress=NOW() WHERE cleanup_in_progress IS NULL"); Then if $rows_affected == 1 we succeeded and can go ahead with cleanup. If $rows_affected == 0 someone else started cleanup in the meantime - no problem, we'll go ahead with other processing. Eventually we could rework the whole cleaning process. Instead of doing (potentially massive) cleanups in fixed time periods there could be a minor cleanup on every, say, 1000th call. No need to count the calls, simple 'if(int(rand(1000))==0) { .. cleanup .. }' should be sufficient. Perhaps have three independent rng calls [cleanup_connect(), cleanup_from_awl(), cleanup_domain_awl()] to spread the load over time even further. The advantage here is that the cleanup will take place at intervals relative to the load of the given server not in fixed time periods that may be difficult to set right. Each of these cleanups should be pretty fast, affecting only a few rows, therefore the problem with multiple sqlgrey processes attempting to cleanup at the same time should go away as well (firstly because they're not likely to trigger cleanup at the very same moment and secondly because the time spent with tables locked should be very short. Then we could get along without last_dbcleanup altogether. Thoughts? Michal -- * http://smtp-cli.logix.cz - the ultimate command line smtp client |