Re: [Sqlgrey-users] sqlgrey fails when configured with multiple machines

Alex wrote:
> Okay, maybe your definition of "hang" is different than mine, but perhaps

By hanging, I mean "any network connection or connection attempt that
stalls for more than a few seconds".

You have this whole chain of individual connections:
  internet -> postfix -> sqlgrey -> mysql

Each of these has its own timeout value, and they don't have to be the
same. So when postfix connects to sqlgrey, it's not going to wait forever
for a reply. If sqlgrey's attempt to connect to mysql "hangs" for more
seconds than postfix is willing to wait, postfix kills the connection and
replies "Server configuration problem".

Thus, if your mysql connection attempt doesn't time out fast enough,
sqlgrey never gets a chance to reply "dunno" to postfix and allow the mail
to go through.
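
To make the timing argument concrete, here is a small Python sketch of the
chain for the case where mysql is down. It is only a toy model, not sqlgrey
or postfix code: the function name and all the numbers are made up for
illustration. The only behaviour taken from this thread is that sqlgrey
answers "dunno" when it gives up on the database, and that postfix answers
"Server configuration problem" when sqlgrey takes too long.

```python
def what_postfix_sees(db_hang_seconds, db_connect_timeout, postfix_policy_timeout):
    """Toy model of postfix -> sqlgrey -> mysql while mysql is down.

    db_hang_seconds:        how long the mysql connect attempt would stall
    db_connect_timeout:     how long sqlgrey waits for mysql (None = wait it out)
    postfix_policy_timeout: how long postfix waits for sqlgrey
    All values are in seconds and purely hypothetical.
    """
    # sqlgrey is stuck until the connect attempt ends or its own timeout fires.
    if db_connect_timeout is None:
        sqlgrey_busy_for = db_hang_seconds
    else:
        sqlgrey_busy_for = min(db_hang_seconds, db_connect_timeout)

    # Postfix loses patience first, so the sender gets a temporary error.
    if sqlgrey_busy_for > postfix_policy_timeout:
        return "Server configuration problem"

    # sqlgrey gave up on mysql in time and can still let the mail through.
    return "dunno"
```

With a hypothetical 120 second stall and postfix waiting 100 seconds,
what_postfix_sees(120, None, 100) gives "Server configuration problem",
while what_postfix_sees(120, 1, 100) gives "dunno".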

>> db_host = mail02.example.com;mysql_connect_timeout=1
>>
>> (same line, no extra spaces) and then restart sqlgrey and see if it
>> helps.
>>
>
> Please confirm that you think I should do this, given the new information
>  about failures above.

Yes. I think you should :). I have tested this with 1.7.4 and 1.8.0 and it
works in both cases. What I'm doing is simply adding a connect timeout of
1 second to the mysql connection. So if the connect attempt hangs (as per
my earlier definition), it will give up after one second. (Of course you
could use more than 1 second, if you worry that your SQL server will ever
take longer than that to accept a connection.)
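
Why would putting extra text into db_host do anything? A rough sketch of
the idea, in Python for illustration only. It ASSUMES that the db_host
value is pasted verbatim into a semicolon-separated, DBI-style connection
string, so that anything after the hostname is read as one more
connection option. build_dsn and dsn_options are hypothetical helpers, not
sqlgrey functions, and the database name "sqlgrey" is just an example.

```python
def build_dsn(db_type, db_name, db_host):
    # ASSUMPTION: the configured db_host ends up verbatim in the
    # connection string, with no escaping of ';' or '='.
    return "DBI:{}:dbname={};host={}".format(db_type, db_name, db_host)


def dsn_options(dsn):
    # Split "DBI:driver:key=value;key=value" into a dict of options.
    _, _, option_part = dsn.split(":", 2)
    return dict(pair.split("=", 1) for pair in option_part.split(";"))


dsn = build_dsn("mysql", "sqlgrey", "mail02.example.com;mysql_connect_timeout=1")
options = dsn_options(dsn)
# The hostname is unchanged, and the timeout rides along as its own option.
```

Under that assumption, extra spaces would end up inside the option names or
values, which is why the line has to be written without them.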

In my tests, this solves the issue, because postfix doesn't have to time
out the connection to sqlgrey and everything remains shiny.

(shiny = "mails will pass through unhindered, while the sql-server is down")

> Yes, okay, I do understand that. I should have written that as well, but
> my main reason is to avoid users from being greylisted numerous times for
> sending mail to the same user in the same domain.

For that, you only need 1 sql-server, shared among all mail servers, and
sqlgrey running with db_cluster=off.

"db_cluster=on" is only needed if the 1 sql-server can't service all your
mail servers fast enough.

(I'm not saying that you're doing it wrong, I'm just pointing out the
different motivations.)

>> Under normal load, i can easily point all queries to the db-master,
>> without any problems.  I just tested with db_cluster=off and i can see
>
> Okay, but if the master dies, then no queries occur, correct?

Correct. But no queries occur in db_cluster=on mode either, if the master
dies. sqlgrey falls back to "allow everything" if db_host (the master)
dies, so there is no need to do queries anymore until the master is
online again.

>> read_hosts=localhost
>> prepend = 0
>> optmethod = optout
>> discrimination = on
> There are a few options there that I'm not using, and I don't recognize,
> but I don't believe the lack of any of them would cause the issue I'm
> having, correct?

No, none of them would. There are no undocumented settings here that
relate to database connections. In fact, the only option I'd try to change
in your case would be prepend. Though I doubt it has any effect, it does
change the way sqlgrey responds to postfix. And if postfix doesn't
understand the response, you get "Server configuration problem".

>> 1 master and many slaves replicating. Each slave lives on the
>> mailserver-node, together with postfix and sqlgrey. All sqlgrey's use
>> localhost for read, master for write.
>>
>
> Ah, I think I have it configured for all hosts to write to the one
> master. How can you have all hosts write to the local database, yet have
> any kind of synchronization between tables?

Hmm.. Let me just explain MySQL Replication real quick:

You have a mysql server. You do reads and writes and everything is fine.
Now you'd like a "replica". So you make a NEW mysql-server, calling it
"slave01". Then you instruct slave01 to "replicate" from the master. The
slave is actually doing all the work, replication wise. The master doesn't
know and doesn't care about how the slave is doing, if it's behind or
whatever.
And you can add more slaves and the master still doesn't know or care.

The master doesn't know it's a master. It doesn't "act" differently. It
can still do reads and writes just like when it was stand-alone. Any
statements executed on the master that would change data in any way get
executed on all the slaves as well, via replication.

On the slaves, you can do reads. (Technically a slave can also do writes,
but writing to one would not be smart, as it causes inconsistencies with
the master and can make replication stop dead.)
If writes WERE done on a slave, those changes would NOT be replicated to
the master. That's simply not how it works.
The slaves copy all INSERT, REPLACE, UPDATE, DELETE, CREATE, ALTER, etc.
statements from the master and execute them on themselves.

So now you have 1 server where you can read and write all you like, and X
slave servers, that should have the same data as the master, where you can
do read queries.
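
If it helps, the same picture as a toy Python model. This is NOT how MySQL
is implemented; the class and method names are invented, and a plain list
stands in for whatever the master records about data-changing statements.
It only shows the direction of the flow: the slave pulls from the master,
and nothing flows back.

```python
class Master:
    """An ordinary server that also records its data-changing statements."""

    def __init__(self):
        self.rows = set()
        self.log = []  # stand-in for the statements the slaves will copy

    def insert(self, row):
        self.rows.add(row)
        self.log.append(("INSERT", row))

    def delete(self, row):
        self.rows.discard(row)
        self.log.append(("DELETE", row))


class Slave:
    """Pulls statements from the master and replays them locally."""

    def __init__(self, master):
        self.master = master
        self.rows = set()
        self.position = 0  # how far into the master's log we have read

    def catch_up(self):
        # The slave does all the work; the master is never told about it.
        for action, row in self.master.log[self.position:]:
            if action == "INSERT":
                self.rows.add(row)
            else:
                self.rows.discard(row)
        self.position = len(self.master.log)

    def local_write(self, row):
        # Possible, but nothing carries this change back to the master.
        self.rows.add(row)


master = Master()
slave01 = Slave(master)
master.insert("a")
master.insert("b")
master.delete("a")
slave01.catch_up()        # slave01 now holds the same rows as the master
slave01.local_write("x")  # the master never sees "x": the copies now differ
```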

So now that we know that slaves are just a read-only copy of the master,
and the master is still just a normal mysql-server, I assume you can see
why disabling db-clustering won't change anything as long as the master
doesn't suffer from poor performance. All that happens when you set
db_cluster=off is that the slaves won't be used for reads anymore and
all read queries will go to the master instead.
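
Put as a few lines of Python, the routing I'm describing looks roughly like
this. pick_host is a made-up function for illustration, not sqlgrey's code,
and taking the first entry of read_hosts is a simplification of mine.

```python
def pick_host(query_kind, db_cluster, db_host, read_hosts):
    """Return the server a query is sent to.

    query_kind: "read" or "write"
    db_cluster: True for db_cluster=on, False for db_cluster=off
    db_host:    the master
    read_hosts: list of servers usable for reads (the slaves)
    """
    if query_kind == "write":
        return db_host        # writes always go to the master
    if db_cluster and read_hosts:
        return read_hosts[0]  # simplification: just take the first read host
    return db_host            # cluster off: reads go to the master too
```

So with db_cluster=on and read_hosts=localhost, reads stay on the local
slave and writes go to the master. With db_cluster=off, everything goes to
the master.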

Hope that makes it clearer.

- Dan