Re: [Sqlgrey-users] postgery to sqlgrey

Farkas Levente wrote the following on 12/15/04 13:04 :

>
> - my first assumption: if one of the mx can't access one sql server, 
> then none can. otherwise it's a really strange situation, and in that 
> case we can accept no greylisting (I just don't know).

This is the root of the problem : this assumption is incorrect.

There are several cases where it can happen :
- temporary network link failure (cable unplugged, hardware failing then 
resynchronising) : your network is temporarily split,
- SQLgrey automatically reconnects after an error, so if you take the RW 
database down for a short time, some SQLgrey instances will need to 
access the database during this short time and some won't. The former 
won't be able to reconnect to the database they were using, so they will 
look for another. The latter won't need the database until it is back 
up, so they *will* be able to reconnect to it. You end up with instances 
using different databases.
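
To make the split concrete, here is a rough sketch in Python. It is not 
SQLgrey code : choose_database, simulate and the mx1/mx2 names are made up 
for illustration, and it assumes each instance falls back to the slave on 
its own when it can't reach the master.

```python
# Hypothetical model: each SQLgrey instance prefers the master and falls
# back to the slave when it alone cannot reach the master.
def choose_database(can_reach_master):
    return "master" if can_reach_master else "slave"


def simulate(reachability, writes):
    """Apply writes to whichever database each instance can reach.

    reachability: {instance_name: bool}
    writes: list of (instance_name, key, timestamp)
    """
    databases = {"master": {}, "slave": {}}
    for instance, key, timestamp in writes:
        target = choose_database(reachability[instance])
        databases[target][key] = timestamp
    return databases


# mx1 still sees the master; mx2 lost its link to it (partial split).
reachability = {"mx1": True, "mx2": False}
writes = [
    ("mx1", "sender-a", 100),
    ("mx2", "sender-b", 101),
    ("mx2", "sender-a", 102),
]
result = simulate(reachability, writes)
print(result["master"])  # only mx1's write
print(result["slave"])   # mx2's writes, unknown to the master
```

Both databases accepted writes during the same period, and neither holds 
the full picture.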

> - try to use replication between sql servers.

You have to be more precise on this : there are very different 
implementations of replication between databases, from the simple dump 
to file/reload to the Oracle cluster. Each one comes with its advantages 
and limitations, and the one you use determines what the applications 
using the database pool can and cannot do with it.

> - allow writes to the slave too, and when the master wakes up, 
> replicate the data back as well.

This won't work : your slave could be used at any moment by a SQLgrey 
instance which, for whatever reason, couldn't contact your master while 
the others are still writing to it. You'll corrupt your data.

> - in my case there is actually no master and slave, just two 
> sql servers with the same database (or almost the same, with 
> certain points when they sync), and there is always one which is 
> rw by all greylist servers (the first).
>
> imho the greylist database is not so complicated. it's easy to 
> recognize which records have to be replicated. only old/expired 
> records have to be deleted, the last updated one is always the latest, 
> and every record has a timestamp (because that's the main purpose of 
> the database), so it's easy to know which was updated last.
>
>> Here are simple questions to make sure we speak of the same things. 
>> Do you agree with the following statements ?
>> - one and only one sql server should accept writes from every SQLgrey 
>> instance. Let's call it the RW server (read-write).
>
>
> no. both, but all greylist servers rw the same one at any given time.
>

Won't work, as explained above. You can't be sure one SQLgrey instance 
won't fail to contact the database you chose as a master while the 
others succeed. There's no point discussing the rest until you 
understand this.

Reliable database failover is *hard*, please take the time to understand 
these hard facts :
- there's no affordable database system on the market *yet* that allows 
multiple replicated read/write databases (only commercial databases in 
the hundreds of thousands of euros/dollars range allow this, and even 
they have limitations) : you can only bet on master/slave schemes,
- when using master/slave schemes you *can't* write directly to the 
slaves : you must use one and only one database in read/write mode for 
*every* SQL client accessing the database pool,
- you cannot prevent the case where one instance, and only that one, 
among a pool of SQL clients can't contact the "master" server.

Seriously, what do you find wrong with an IP takeover solution ?
Reminder : slave replication is in place, the master fails, admin 
scripts detect the failure, take the IP down on the master's interface 
and set up the same IP on the slave, switching it to read-write mode if 
needed (this depends on how the replication works : it may or may not 
require the slave database to be in read-only mode).
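
As a rough illustration of what such admin scripts could look like, here 
is a Python sketch. Everything in it is an assumption made for the 
example : the function names, the idea of detecting failure with a plain 
TCP connect, and the use of the Linux "ip addr" command to move the 
address. It only builds the commands and runs nothing. Switching the 
slave to read-write is database-specific and is left out.

```python
import socket


def master_is_up(host, port, timeout=2.0):
    """Return True if a TCP connection to the database port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def takeover_commands(service_ip, prefix_len, interface):
    """Build the commands for each side; nothing is executed here.

    service_ip is the single address every SQLgrey instance connects to.
    """
    address = f"{service_ip}/{prefix_len}"
    return {
        # Release the service IP on the failed master (if still reachable).
        "on_master": [["ip", "addr", "del", address, "dev", interface]],
        # Bring the same IP up on the slave so clients reconnect to it.
        "on_slave": [["ip", "addr", "add", address, "dev", interface]],
    }
```

For example, takeover_commands("192.0.2.10", 24, "eth0") gives the 
"ip addr del" / "ip addr add" pair for that address. Only one host holds 
the IP at any time, which is the whole point.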

This is easily workable : it ensures you can't access 2 databases at 
the same time, and SQLgrey will make the IP takeover transparent as it 
will automatically reconnect to the server replacing the failing one.
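
The client side of this can be pictured with the small Python sketch 
below. It is not SQLgrey's actual code : connect_with_retry and its 
parameters are hypothetical, and connect stands for any function that 
opens a connection to the one service IP. The client never chooses 
another server, it just retries the same address until whichever host 
holds the IP answers.

```python
import time


def connect_with_retry(connect, attempts=5, delay=1.0, sleep=time.sleep):
    """Call connect() until it succeeds or the attempts run out.

    connect is a hypothetical zero-argument callable that always targets
    the same service IP; it never picks a different server on its own.
    """
    last_error = None
    for _ in range(attempts):
        try:
            return connect()
        except ConnectionError as error:
            last_error = error
            sleep(delay)  # wait before trying the same address again
    raise ConnectionError("database unreachable") from last_error
```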

Best regards,

Lionel.