Re: [Sqlgrey-users] sqlgrey fails when configured with multiple machines

On 2014-07-04 16:14, Alex wrote:
> > By hanging, i mean "any network connection or connection attempt, that
> > stalls for more than a few seconds".
>
> Okay, that's how I understand it, but that's not what's happening here.
All evidence so far points to this explanation, including the fact that
my timeout fix worked.
I'm unsure what you base your assumption on that this is not what's
happening, as the logs won't show you this. You'd need to do something
like modifying the sqlgrey code to provide you with debugging
information, or use telnet/netcat to talk to sqlgrey and postfix.
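
If you'd rather script the telnet/netcat check, here is a minimal Python
sketch that speaks the Postfix policy delegation protocol (name=value lines
ended by an empty line, answered with an "action=..." line). The function
names, the attribute values and the 5 second timeout are just assumptions
for illustration, not anything taken from sqlgrey itself.

```python
import socket


def build_policy_request(client_address, sender, recipient):
    """Build a policy delegation request: name=value lines plus an empty line."""
    attrs = [
        ("request", "smtpd_access_policy"),
        ("protocol_state", "RCPT"),
        ("protocol_name", "ESMTP"),
        ("client_address", client_address),
        ("client_name", "unknown"),
        ("sender", sender),
        ("recipient", recipient),
    ]
    return "".join(f"{name}={value}\n" for name, value in attrs) + "\n"


def parse_policy_reply(reply):
    """Return the text after 'action=' or None if the reply has no action line."""
    for line in reply.splitlines():
        if line.startswith("action="):
            return line[len("action="):]
    return None


def ask_policy_daemon(host, port, request, timeout_s=5.0):
    """Send one request and read until the empty line that ends the reply.

    A stalled daemon shows up as a socket timeout exception here instead of
    an answer, which is exactly the "hanging" case.
    """
    with socket.create_connection((host, port), timeout=timeout_s) as sock:
        sock.sendall(request.encode("ascii"))
        received = b""
        while b"\n\n" not in received:
            data = sock.recv(4096)
            if not data:
                break
            received += data
    return received.decode("ascii", "replace")
```

Point ask_policy_daemon() at whatever host and port your sqlgrey listens on
(check the inet setting in your sqlgrey.conf) and time how long the answer
takes while the master database is down.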

> There are two scenarios where I get the "451 4.3.5 Server
> configuration problem" error. The first is if sqlgrey dies on any
> system, then that system will respond with the error.
> That's assuming sqlgrey is still around to respond.  I need to also
> consider the possibility where sqlgrey dies.

I've never experienced sqlgrey just dying on me, but if it happens, it
is Postfix that decides what to respond. It cannot be influenced by
sqlgrey.
And the error, 451, is a temporary error, so mails will be delivered
once sqlgrey is running again.

I don't think there's a setting in postfix to choose default answers to
policy daemon failures. So this will be the same issue with any postfix
policy daemon that isn't running.

> > >> db_host = mail02.example.com;mysql_connect_timeout=1
>
> I don't see that option in the default documentation. Where is this
> documented?
It's not an option. It's a hack I made up for this occasion.

Sqlgrey uses a "DSN" internally for connecting to mysql. They look
something like this:

DBI:mysql:sqlgrey;host=db.example.com;port=3306

And in sqlgrey, $host is just inserted into this DSN, something like this:
DBI:mysql:sqlgrey;host=$host;port=3306

Which is why, if $host = "127.0.0.2;whatever=3", the DSN will contain
DBI:mysql:sqlgrey;host=127.0.0.2;whatever=3;port=3306

and mysql_connect_timeout happens to be an option you can add to the DSN.
So it's just a hack. It's definitely something we should add as an option
in a later version.
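
To make the trick concrete, here is a small Python sketch of the same string
interpolation. build_dsn() is a made-up name for illustration; sqlgrey itself
is Perl and its real code will look different.

```python
def build_dsn(db_name, host, port):
    """Mimic plain string interpolation of the host into a DBI-style DSN."""
    return f"DBI:mysql:{db_name};host={host};port={port}"


# A normal host gives the usual DSN.
print(build_dsn("sqlgrey", "db.example.com", 3306))

# Anything after a ';' in the host value ends up as an extra DSN attribute.
print(build_dsn("sqlgrey", "mail02.example.com;mysql_connect_timeout=1", 3306))
```

The second call yields a DSN with mysql_connect_timeout=1 sitting between the
host and the port, which is all the hack does.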

> So then in my setup, where the master mysql daemon is unavailable,
> each client references their own database? And no updating is
> occurring since they aren't configured as write servers, correct?
Sqlgrey will default to "accept all mail" when the master is unavailable, so
there is no need to read anything until the master is back online.

>
> > In my tests, this solves the issue, because postfix doesn't have to
> > timeout the connection to sqlgrey and everything remains shiny.
> >
> > (shiny = "mails will pass through unhindered, while the sql-server
> is down")
>
> So postfix was always waiting patiently enough; it was sqlgrey that
> was responding with failure too quickly?
No, the other way around. sqlgrey may take 3 minutes to get a timeout
from its mysql connect(). But postfix "ain't got time for that" and
disconnects already after, e.g., 100 seconds. So sqlgrey is too slow
to respond to postfix and postfix just disconnects. And THAT'S why you
get "Server configuration problem".
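
The race between the two timeouts can be written down as a toy model in
Python. This is only a simplification: postfix_sees() is a made-up name, and
the 180 and 100 second figures are the example numbers from above, not
measured values.

```python
def postfix_sees(db_connect_timeout_s, policy_timeout_s):
    """Toy model: whichever timeout fires first decides what the client gets."""
    if db_connect_timeout_s < policy_timeout_s:
        # sqlgrey notices the dead database in time and can still answer.
        return "sqlgrey answers in time; mail is accepted"
    # postfix hangs up on the policy daemon before sqlgrey has an answer.
    return "postfix gives up first; 451 4.3.5 Server configuration problem"


# About 3 minutes for the mysql connect vs. a 100 second postfix limit.
print(postfix_sees(180, 100))

# With mysql_connect_timeout=1 in the DSN, sqlgrey gives up after 1 second.
print(postfix_sees(1, 100))
```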

>
> > "db_cluster=on" is only needed if the 1 sql-server cant service all your
> > mail servers fast enough.
> >
> > (I'm not saying that you're doing it wrong, I'm just pointing out the
> > different motivations.)
>
> So it's okay to leave it on, correct?
Yes, it's fine.
> Wouldn't this also serve to make it possible for existing entries to
> be queried through the local copies while the master is unavailable?
No. There is no database "high availability" here. If the master dies, all
mail is accepted by default.

>
> > dies. And as such, there is no need to do queries anymore, until
> master is
> > online again.
>
> Each client has a local copy of the database, no? And by setting
> read_hosts to contain at least localhost, it should then be able to
> query the local database, no?

In theory, we could query the localhost. But since sqlgrey will
fall back to allowing all mails through, it doesn't matter what is in
the database, since the mail will go through anyway.

And sqlgrey doesn't really work without being able to write, so it is
smarter just to accept all mail.
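
As a rough Python sketch of that fallback (not sqlgrey's actual code):
greylist_action() and the lookup callable are hypothetical, and the "dunno"
and "defer_if_permit" replies are my assumption of what "accept" and
"greylist" look like on the policy protocol.

```python
def greylist_action(lookup, triplet):
    """Return a policy action; fall back to accepting if the database is down.

    lookup is a hypothetical callable that returns True when the triplet is
    already known and raises ConnectionError when the master is unreachable.
    """
    try:
        known = lookup(triplet)
    except ConnectionError:
        # Master is down: no reads, no writes, just let the mail through.
        return "dunno"
    if known:
        return "dunno"
    return "defer_if_permit Greylisted"


def master_is_down(triplet):
    raise ConnectionError("master unavailable")


print(greylist_action(master_is_down, ("192.0.2.1", "a@example.com", "b@example.org")))
```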

>
> > >> read_hosts=localhost prepend = 0 optmethod = optout
> discrimination = on
>
> I don't see where these options are defined either.
I see all of them, with comments, in the sample config that comes with
sqlgrey-1.8.0. Have a look there and see whether everything is explained.

> > Since, all that happens by setting
> > db_cluster=off, is that all the slaves wont be used for reads
> anymore and
> > all read queries will go to the master instead.
>
> Okay, got it. I think I got confused, but I believe I understood it
> correctly, in that when the master is down, the slaves can continue to
> read from their local database. I think it was just the db_cluster
> terminology that I wasn't understanding there.
Yes. In general (non-sqlgrey) cases, when an sql-master is down, the
application can still read from the slaves. Sqlgrey just doesn't use
this, as sqlgrey NEEDS to be able to write.

Hope that answers everything :)

- Dan