Re: [Sqlgrey-users] sqlgrey fails when configured with multiple machines

Hi,

> > Okay, maybe your definition of "hang" is different than mine, but perhaps
>
> By hanging, i mean "any network connection or connection attempt, that
> stalls for more than a few seconds".

Okay, that's how I understand it, but that's not what's happening here.

There are two scenarios where I get the "451 4.3.5 Server configuration
problem" error. The first is when sqlgrey dies on any system; that system
then responds with the error. The second is when mysql is stopped on the
master server.

(After adding your mysql_connect_timeout=1 option, it no longer fails when
mysql dies.)

However, postfix still responds with "server configuration ..." if sqlgrey
is dead or inaccessible. This is the issue I need to fix now.

> You have this whole chain of individual connections:
>   internet -> postfix -> sqlgrey -> mysql
>
> Each of these have a timeout value. Which doesn't have to be the same.
> So when postfix connects to sqlgrey, its not gonna wait forever for a
> reply. If sqlgrey's attempt to connect to mysql "hangs", for more seconds
> than Postfix is willing to wait, postfix kills the connection and replies
> "Server configuration error".
>
> Thus, if your mysql connection attempt doesn't timeout fast enough,
> sqlgrey never gets a chance to reply "dunno" to postfix and allow the mail
> to go through.

That's assuming sqlgrey is still around to respond.  I need to also
consider the possibility where sqlgrey dies.

> >> db_host = mail02.example.com;mysql_connect_timeout=1
> >>
> >> (same line, no extra spaces) and then restart sqlgrey and see if it
> >> helps.

Okay, this did appear to solve the problem where the master mysqld is not
able to respond. It no longer responds with "Server configuration ...",
which is good.

I don't see that option in the default documentation. Where is this
documented?

> > Please confirm that you think I should do this, given the new information
> > about failures above.
>
> Yes. I think you should :). I have tested this with 1.7.4 and 1.8.0 and it
> works in both cases. What I'm doing, is simply adding a connect-timeout of
> 1 second to the mysql connection. So if the connect attempt hangs (as per
> my earlier definition), it will give up after one second. (Of course, you
> could use more than 1 second, if you worry that your SQL server will ever
> be slower than 1 second to accept a connection).
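
To make sure I follow the mechanism, here is a rough Python sketch of what I
understand the connect timeout to do. This is only an illustration, not
sqlgrey's actual (Perl) code: the function name greylist_action, its
arguments, and the "check" return value are all made up for the example. The
idea is just that a connection attempt which gives up after connect_timeout
seconds lets the policy daemon answer "dunno" before postfix stops waiting.

```python
import socket


def greylist_action(db_addr, connect_timeout=1.0,
                    connect=socket.create_connection):
    """Hypothetical helper: decide what to answer when the database may hang.

    db_addr is a (host, port) tuple. `connect` is injectable so the
    behaviour can be exercised without a real database server.
    """
    try:
        # Give up after connect_timeout seconds instead of stalling
        # until the caller (postfix, in this thread) loses patience.
        conn = connect(db_addr, timeout=connect_timeout)
    except OSError:
        # Covers timeouts, refused connections and unreachable hosts:
        # fall back to "dunno" so the mail is not rejected.
        return "dunno"
    conn.close()
    # The database is reachable; a real daemon would run its lookup here.
    return "check"
```

With a short timeout, a stalled database costs about one second per request
rather than however long the connection attempt would otherwise hang.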

So then in my setup, where the master mysql daemon is unavailable, each
client references their own database? And no updating is occurring since
they aren't configured as write servers, correct?

> In my tests, this solves the issue, because postfix doesn't have to
> timeout the connection to sqlgrey and everything remains shiny.
>
> (shiny = "mails will pass through unhindered, while the sql-server is
> down")

So postfix was always waiting patiently enough; it was sqlgrey that was
responding with failure too quickly?

> > Yes, okay, I do understand that. I should have written that as well, but
> > my main reason is to avoid users from being greylisted numerous times for
> > sending mail to the same user in the same domain.
>
> For that, you only need 1 sql-server, shared among all mail servers. And
> sqlgrey running with db_cluster=off.
>
> "db_cluster=on" is only needed if the 1 sql-server can't service all your
> mail servers fast enough.
>
> (I'm not saying that you're doing it wrong, I'm just pointing out the
> different motivations.)

So it's okay to leave it on, correct? Wouldn't this also serve to make it
possible for existing entries to be queried through the local copies while
the master is unavailable?

> >> Under normal load, i can easily point all queries to the db-master,
> >> without any problems.  I just tested with db_cluster=off and i can see
> >
> > Okay, but if the master dies, then no queries occur, correct?
>
> Correct. But no queries occur in db_cluster=on mode either, if master
> dies. sqlgrey defaults back to "allow everything" if db_host (the master)
> dies. And as such, there is no need to do queries anymore, until master is
> online again.
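
If I read this correctly, the read path looks roughly like the Python sketch
below. It is my own summary of the description above, not sqlgrey's code; the
function and argument names are hypothetical, and picking the first entry of
read_hosts is a simplification.

```python
def pick_read_host(db_cluster, master_up, db_host, read_hosts):
    """Hypothetical model of where a read query goes.

    Returns the host to query, or None when no query is made and
    everything is allowed through.
    """
    if not master_up:
        # As described above: if db_host (the master) is down, sqlgrey
        # allows everything and does no queries, in either mode.
        return None
    if db_cluster and read_hosts:
        # db_cluster=on: reads go to a read-only copy.
        return read_hosts[0]
    # db_cluster=off: every read goes to the master.
    return db_host


print(pick_read_host(True, True, "mail02.example.com", ["localhost"]))
print(pick_read_host(False, True, "mail02.example.com", ["localhost"]))
print(pick_read_host(True, False, "mail02.example.com", ["localhost"]))
```

The three calls print localhost, mail02.example.com, and None respectively.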

Each client has a local copy of the database, no? And by setting read_hosts
to contain at least localhost, it should then be able to query the local
database, no?

> >> read_hosts=localhost
> >> prepend = 0
> >> optmethod = optout
> >> discrimination = on
> > There are a few options there that I'm not using, and I don't recognize,
> > but I don't believe the lack of any of them would cause the issue I'm
> > having, correct?
>
> No. There are no undocumented settings here that relate to connections
> to databases. In fact, the only option I'd try to change in your case,
> would be the prepend. Though i doubt it has any effect, it does change the
> way sqlgrey responds to postfix. And if postfix doesn't understand the
> response, you get "Server configuration problem".

I don't see where these options are defined either.

> So now that we know that slaves are just a read only copy of the master,
> and the master is still just a normal mysql-server, i assume you can see
> why disabling db-clustering won't change anything as long as the master
> doesn't suffer from poor performance, since all that happens by setting
> db_cluster=off is that the slaves won't be used for reads anymore and
> all read queries will go to the master instead.

Okay, got it. I think I got confused, but I believe I understood it
correctly, in that when the master is down, the slaves can continue to read
from their local database. I think it was just the db_cluster terminology
that I wasn't understanding there.

Thanks again,
Alex