Re: [Sqlgrey-users] Duplicate connect records

Steffen Plotner wrote the following on 14.09.2005 20:51 :

>Hello Lionel,
>
>Yesterday, I really did have duplications for whatever reason I have not
>determined yet. The duplications I thought I had today were incorrectly
>queried for. The query should have been:
>
>select sender_name, sender_domain, src,
>count(sender_name+sender_domain+src+rcpt) as count 
>from connect 
>group by sender_name, sender_domain, src, rcpt 
>having count(sender_name+sender_domain+src+rcpt) > 1;
>
>Which includes the rcpt field in the group and I have NO duplications.
>
>So, that leaves us with why did I have duplications.
>
>Would you mind explaining the syntax you use in your Perl code? I use
>Perl a bit, but I admit that I am not totally clear about the syntax
>below, in particular the $#$ part. I understand that $result
>contains a reference:
>
>    if ($#$result != 0) {
>        return 0; # not a single entry
>    } else {
>        return 1; # at least one entry
>    }
>
>Thinking that it might mean the number of records (?) I wrote a little
>script to check this:
>
>The query SHOULD return ONE record (at least in my connect table; just
>substitute the value below):
>
>my $sth = $dbconn->prepare("SELECT * FROM connect WHERE sender_name = 'zwsfkz'");
>$sth->execute();    # run the query before the first fetch
>
>my $result = $sth->fetchall_arrayref();
>print "$#$result \n";
>
>$sth->execute();
>my $count=0;
>while ( my $row = $sth->fetchrow_hashref() )
>{
>	$count++;
>}
>print "$count\n";
>
>And I find that the first value is always one less than the actual
>count. Hmmm. 
>
>  
>

I just realized that I introduced a bug not so long ago. It seems you 
are the first to be hit, and you were on the right track above (the logic 
is now flawed when deciding if an entry in connect must be created).
Since SQLgrey serializes message processing, the bug can only hit 
installs with multiple greylisting servers: the symptom is mail being 
delayed indefinitely (ouch!). It's a race condition that can be hard to 
hit. There are two cases I can think of:
- the other server tries at least two MXs at the same time without 
waiting for the first answer (I can't imagine why a normal SMTP server 
would try the same delivery on every MX; only spam sources might try that).
- one server (or several in the same class C) delivers several messages 
from the same source to the same destination at the exact same time 
(given the observed timings, far less than a second apart) and uses 
different MXs for that.
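
On the $#$result question: $#$result is the last index of the array that
$result refers to, so it is always one less than the row count (and -1 for
an empty result). The sketch below is only an illustration, not SQLgrey's
actual code: the helper names exactly_one_row and at_least_one_row and the
sample rows are made up. It shows that the quoted test returns 1 only when
there is exactly one row, so two identical rows are treated like no row at
all.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the quoted check: returns 1 only when the
# last index is 0, i.e. when the result holds exactly one row.
sub exactly_one_row {
    my ($result) = @_;
    return $#$result != 0 ? 0 : 1;
}

# Hypothetical alternative for comparison: returns 1 as soon as there is
# at least one row, duplicates included.
sub at_least_one_row {
    my ($result) = @_;
    return scalar(@$result) > 0 ? 1 : 0;
}

# Stand-ins for what fetchall_arrayref() could return (made-up rows).
my $none = [];
my $one  = [ [ 'zwsfkz', 'example.com' ] ];
my $two  = [ [ 'zwsfkz', 'example.com' ], [ 'zwsfkz', 'example.com' ] ];

for my $rows ($none, $one, $two) {
    printf "rows=%d last_index=%d exactly_one=%d at_least_one=%d\n",
        scalar(@$rows), $#$rows,
        exactly_one_row($rows), at_least_one_row($rows);
}
```

With two rows the last index is 1, so the first helper reports 0 even
though matching entries exist.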

Until recently, the put_in_connect function added, as a precaution, a 
"DELETE FROM connect ..." cleaning out entries that might conflict with the 
one being inserted. To save an SQL query and speed SQLgrey up a 
little bit, I removed some of the code's dependencies on the assumption that no 
identical entries (save for the timestamp) are in connect at the 
same time, but it seems I forgot to adapt the logic in some places.
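
To make the difference concrete, here is a rough sketch under loud
assumptions: the connect table is replaced by an in-memory array, and
put_plain, put_with_delete, same_entry and count_matching are hypothetical
names, not SQLgrey functions. It replays the race where two servers have
both looked up an entry, found nothing, and then both insert.

```perl
use strict;
use warnings;

# In-memory stand-in for the connect table (an assumption; no database).
my @connect;

# Two entries are "identical" when everything but the timestamp matches.
sub same_entry {
    my ($x, $y) = @_;
    return $x->{sender} eq $y->{sender}
        && $x->{src}    eq $y->{src}
        && $x->{rcpt}   eq $y->{rcpt};
}

# Hypothetical insert without any cleanup beforehand.
sub put_plain {
    my ($e) = @_;
    push @connect, { %$e };
}

# Hypothetical insert that first drops conflicting entries, in the spirit
# of a precautionary "DELETE FROM connect ..." before the INSERT.
sub put_with_delete {
    my ($e) = @_;
    @connect = grep { !same_entry($_, $e) } @connect;
    push @connect, { %$e };
}

sub count_matching {
    my ($e) = @_;
    return scalar grep { same_entry($_, $e) } @connect;
}

# Made-up message triplet.
my %msg = (sender => 'zwsfkz@example.com', src => '192.0.2', rcpt => 'user@example.org');

# Both servers checked before either one inserted, so both insert.
put_plain({ %msg, first_seen => 1 });
put_plain({ %msg, first_seen => 2 });
my $plain_count = count_matching(\%msg);
print "plain insert: $plain_count matching entries\n";

@connect = ();
put_with_delete({ %msg, first_seen => 1 });
put_with_delete({ %msg, first_seen => 2 });
my $delete_count = count_matching(\%msg);
print "delete then insert: $delete_count matching entries\n";
```

In this toy model the plain insert leaves two identical entries, while the
delete-then-insert variant leaves one. In a real database the DELETE and
the INSERT are two separate statements, so this only narrows the window
and says nothing about how 1.6.6 handles it.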

Expect a 1.6.6 release with the bugfix later tonight.

Lionel.