Thread: [Sqlgrey-users] Frequent disconnect/reconnect issues

Brought to you by: gyver, ludvigm, rebum

sqlgrey-users

[Sqlgrey-users] Frequent disconnect/reconnect issues

From: Alex <mys...@gm...> - 2015-08-11 23:13:38

Hi,

I've been running sqlgrey on fedora for quite a while, set up to use
DB_Cluster with a single master set up with mysql (mariadb). Recently,
I've started to receive a large number of "sqlgrey encountered an SQL
error and triggered a reconnection" messages, followed immediately by
a reconnection message.

It seems to happen at all times during the day, not just when the
network or server is congested or busy.

There are three read_hosts, including the master itself, but one of
the hosts has multiple DNS names, so it may be listed twice.

I don't exactly remember how the database schema was set up, which is
kind of a problem if I ever had to set it up again.

I've included the sqlgrey.conf config below, in hopes someone had some
ideas how to troubleshoot this. This is the config that's been working
fine for more than a year.

# cat sqlgrey.conf |grep '^[a-z|A-Z]'
loglevel = 3
log_override = whitelist:1,grey:3,spam:2
reconnect_delay = 5
awl_age = 31
group_domain_level = 10
db_type = mysql
db_name = sqlgrey
db_host = ns1.example.com;mysql_connect_timeout=1
db_port = default
db_user = sqlgrey
db_pass = mypass
db_cleanup_hostname=ns1.example.com
db_cleandelay = 1800
clean_method = sync
db_cluster = on
read_hosts=localhost,ns1.example.com,mail02.example.com,mail03.example.com,mail01.example.com
prepend = 1
admin_mail = 212...@me...

Thanks,
Alex

Re: [Sqlgrey-users] Frequent disconnect/reconnect issues

From: Alex <mys...@gm...> - 2015-08-14 16:58:21

Hi guys,

I sent the message below a few days ago, and haven't received any
responses. I'm still having difficulty with this and really don't know
how to resolve it. It appears to happen to all three systems, two of
which are on the same gigabit network, and the other is on a remote
cable connection (which is the one that least frequently fails).

Would rebuilding the database help? I haven't even been able to find
an error in the logs related to this, only the email that is sent
every time it happens.

Thanks,
Alex


On Tue, Aug 11, 2015 at 7:13 PM, Alex <mys...@gm...> wrote:
> Hi,
>
> I've been running sqlgrey on fedora for quite a while, set up to use
> DB_Cluster with a single master set up with mysql (mariadb). Recently,
> I've started to receive a large number of "sqlgrey encountered an SQL
> error and triggered a reconnection" messages, followed immediately by
> a reconnection message.
>
> It seems to happen at all times during the day, not just when the
> network or server is congested or busy.
>
> There are three read_hosts, including the master itself, but one of
> the hosts has multiple DNS names, so it may be listed twice.
>
> I don't exactly remember how the database schema was set up, which is
> kind of a problem if I ever had to set it up again.
>
> I've included the sqlgrey.conf config below, in hopes someone had some
> ideas how to troubleshoot this. This is the config that's been working
> fine for more than a year.
>
> # cat sqlgrey.conf |grep '^[a-z|A-Z]'
> loglevel = 3
> log_override = whitelist:1,grey:3,spam:2
> reconnect_delay = 5
> awl_age = 31
> group_domain_level = 10
> db_type = mysql
> db_name = sqlgrey
> db_host = ns1.example.com;mysql_connect_timeout=1
> db_port = default
> db_user = sqlgrey
> db_pass = mypass
> db_cleanup_hostname=ns1.example.com
> db_cleandelay = 1800
> clean_method = sync
> db_cluster = on
> read_hosts=localhost,ns1.example.com,mail02.example.com,mail03.example.com,mail01.example.com
> prepend = 1
> admin_mail = 212...@me...
>
> Thanks,
> Alex

Re: [Sqlgrey-users] Frequent disconnect/reconnect issues

From: Karl O. P. <ko...@me...> - 2015-08-15 01:32:52

I prefer to stay far away from MySQL/MariaDB.  
Having said that it seems
like the first thing you should do is instrument the db
so that you can tell when connections might be lost for
some reason.  You get the message when a db connection
goes bad for some reason or another.  (For me, it's when
my Postgres DB restarts due to security update.)

You could ask the MariaDB people for help with this.

If you followed the install directions your db was created
the same way as everybody else's.  You could re-create
it the same way, but I don't see a point until you
know what's wrong.

I'd expect sqlgrey to be configured to log to the
mail log, although I suppose yours could be different.
This will give you timestamps to use when looking through
the db log.

On Fri, 14 Aug 2015 12:58:14 -0400
Alex <mys...@gm...> wrote:

> Hi guys,
> 
> I sent the message below a few days ago, and haven't received any
> responses. I'm still having difficulty with this and really don't know
> how to resolve it. It appears to happen to all three systems, two of
> which are on the same gigabit network, and the other is on a remote
> cable connection (which is the one that least frequently fails).
> 
> Would rebuilding the database help? I haven't even been able to find
> an error in the logs related to this, only the email that is sent
> every time it happens.
> 
> Thanks,
> Alex
> 
> 
> On Tue, Aug 11, 2015 at 7:13 PM, Alex <mys...@gm...> wrote:
> > Hi,
> >
> > I've been running sqlgrey on fedora for quite a while, set up to use
> > DB_Cluster with a single master set up with mysql (mariadb).
> > Recently, I've started to receive a large number of "sqlgrey
> > encountered an SQL error and triggered a reconnection" messages,
> > followed immediately by a reconnection message.
> >
> > It seems to happen at all times during the day, not just when the
> > network or server is congested or busy.
> >
> > There are three read_hosts, including the master itself, but one of
> > the hosts has multiple DNS names, so it may be listed twice.
> >
> > I don't exactly remember how the database schema was set up, which
> > is kind of a problem if I ever had to set it up again.
> >
> > I've included the sqlgrey.conf config below, in hopes someone had
> > some ideas how to troubleshoot this. This is the config that's been
> > working fine for more than a year.
> >
> > # cat sqlgrey.conf |grep '^[a-z|A-Z]'
> > loglevel = 3
> > log_override = whitelist:1,grey:3,spam:2
> > reconnect_delay = 5
> > awl_age = 31
> > group_domain_level = 10
> > db_type = mysql
> > db_name = sqlgrey
> > db_host = ns1.example.com;mysql_connect_timeout=1
> > db_port = default
> > db_user = sqlgrey
> > db_pass = mypass
> > db_cleanup_hostname=ns1.example.com
> > db_cleandelay = 1800
> > clean_method = sync
> > db_cluster = on
> > read_hosts=localhost,ns1.example.com,mail02.example.com,mail03.example.com,mail01.example.com
> > prepend = 1
> > admin_mail = 212...@me...
> >
> > Thanks,
> > Alex
> 
> ------------------------------------------------------------------------------
> _______________________________________________
> Sqlgrey-users mailing list
> Sql...@li...
> https://lists.sourceforge.net/lists/listinfo/sqlgrey-users
> 





Karl <ko...@me...>
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein

Re: [Sqlgrey-users] Frequent disconnect/reconnect issues

From: Jean-Michel P. - G. <jm...@go...> - 2015-08-15 11:18:17

Attachments: smime.p7s

Le vendredi 14 août 2015 à 12:58 -0400, Alex a écrit :
> Would rebuilding the database help?

MySQL is dead, has no community, no future (belongs to Oracle)and we cannot loose time debugging which does not and will not ever work. 
Maybe you should make a clean installation of PostgreSQL and use pgAdmin3 GUI to query the database remotely. 
This is a very clean interface which will give all information. 
Kind regards,Jean-Michel

Re: [Sqlgrey-users] Frequent disconnect/reconnect issues

From: Karl O. P. <ko...@me...> - 2015-08-15 15:49:49

On Sat, 15 Aug 2015 13:18:02 +0200
Jean-Michel Pouré - GOOZE <jm...@go...> wrote:

> Le vendredi 14 août 2015 à 12:58 -0400, Alex a écrit :
> > Would rebuilding the database help?
> 
> MySQL is dead, has no community, no future (belongs to Oracle)and we
> cannot loose time debugging which does not and will not ever work.

That's not entirely fair.  There is MariaDB, the future of MySQL
moving forward.

I have technical issues with MySQL/MaraiaDB, but putting those aside
the serious problem that has always existed with them is
that their documentation is not free (as in Libre).  A db is no 
good without documentation, which matters if you care
about vendor lock-in.

Karl <ko...@me...>
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein

Re: [Sqlgrey-users] Frequent disconnect/reconnect issues

From: Alex <mys...@gm...> - 2015-08-15 16:30:28

Hi,

Thanks everyone for your help. It worked with mariadb for months and
months without this problem. I'm going to try and dump the database
and restore it, since that's the easiest.

Thanks again,
Alex

On Sat, Aug 15, 2015 at 11:49 AM, Karl O. Pinc <ko...@me...> wrote:
> On Sat, 15 Aug 2015 13:18:02 +0200
> Jean-Michel Pouré - GOOZE <jm...@go...> wrote:
>
>> Le vendredi 14 août 2015 à 12:58 -0400, Alex a écrit :
>> > Would rebuilding the database help?
>>
>> MySQL is dead, has no community, no future (belongs to Oracle)and we
>> cannot loose time debugging which does not and will not ever work.
>
> That's not entirely fair.  There is MariaDB, the future of MySQL
> moving forward.
>
> I have technical issues with MySQL/MaraiaDB, but putting those aside
> the serious problem that has always existed with them is
> that their documentation is not free (as in Libre).  A db is no
> good without documentation, which matters if you care
> about vendor lock-in.
>
>
>
> Karl <ko...@me...>
> Free Software:  "You don't pay back, you pay forward."
>                  -- Robert A. Heinlein
>
> ------------------------------------------------------------------------------
> _______________________________________________
> Sqlgrey-users mailing list
> Sql...@li...
> https://lists.sourceforge.net/lists/listinfo/sqlgrey-users

Re: [Sqlgrey-users] Frequent disconnect/reconnect issues

From: Jean-Michel P. - G. <jm...@go...> - 2015-08-15 17:56:19

Attachments: smime.p7s

Le samedi 15 août 2015 à 10:49 -0500, Karl O. Pinc a écrit :
> There is MariaDB, the future of MySQL
> moving forward.

Quoting website :***© 2015 MariaDB Corporation Ab except as noted. MariaDB is a trademark or registered trademarks of MariaDB Corporation Ab in the European Union and United States of America and/or other countries. This website is proud to run on MariaDB. MySQL is a trademark of Oracle Corporation Inc. Trademarks Statement | Privacy Policy***
IMHO, This is a company picking-up free software, not a real community relying on individuals, like Kernel.org, X.Org, PostgreSQL.org.
If you need reliable solutions, use a software made and released by communities. If you have access to a dedicated server, justinstall SQLgrey and PostgreSQL and you will be able to run thousands of queries per second, reliably.
The big issue with MySQL is that it dies silently and you don't know what happened to you data, because at some pointMySQL is using query caching and you cannot trace.
Anyway, I wron't start a flame war. Go and use MariaDB if you think this is the future. But don't expect to run more than 10.000 queries per second reliably.

Re: [Sqlgrey-users] Frequent disconnect/reconnect issues

From: Karl O. P. <ko...@me...> - 2015-08-15 19:17:52

On Sat, 15 Aug 2015 19:56:06 +0200
Jean-Michel Pouré - GOOZE <jm...@go...> wrote:

> If you need reliable solutions,
> use a software made and released by communities.

I would say the following: If you need a reliable, by which
I mean long-term, solution use software which
has a community _and_ can be forked by that community.

Even better, use software with a copyleft license.
Copyleft prevents proprietary interests from taking
the software and competing with the community with
the intent to marginalize it. (E.g. The BSD license
let Sun and later Apple marginalize the BSD based OSs.)

Karl <ko...@me...>
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein

Re: [Sqlgrey-users] Frequent disconnect/reconnect issues

From: Alex <mys...@gm...> - 2015-08-18 15:58:45

Hi guys,

> I'd expect sqlgrey to be configured to log to the
> mail log, although I suppose yours could be different.
> This will give you timestamps to use when looking through
> the db log.

I've figured out some new information on this problem.

Aug 18 06:06:19 mail01 sqlgrey: warning: Use of uninitialized value
$DBI::errstr in concatenation (.) or string at /usr/sbin/sqlgrey line
250.
Aug 18 06:06:19 mail01 sqlgrey: dbaccess: warning: couldn't do
query:#012INSERT INTO from_awl (sender_name, sender_domain, src,
first_seen, last_seen)
VALUES('return','nl2go.com','91.121.21','2015-08-18
05:59:38',NOW()):#012, reconnecting to DB
Aug 18 06:06:19 mail01 sqlgrey: dbaccess: Using DBIx:DBCluster
Aug 18 06:06:19 mail01 sqlgrey: dbaccess: Read_hosts: localhost,
ns1.example.com, mail02.example.com, mail03.example.com,
mail01.example.com

Aug 17 01:43:40 mail02 sqlgrey: dbaccess: warning: couldn't do
query:#012INSERT INTO from_awl (sender_name, sender_domain, src,
first_seen, last_seen)
VALUES('financeiro','teamplayy.com.br','91.121.21','2015-08-17
01:37:00',NOW()):#012, reconnecting to DB
Aug 17 01:53:29 mail01 sqlgrey: dbaccess: warning: couldn't do
query:#012INSERT INTO from_awl (sender_name, sender_domain, src,
first_seen, last_seen)
VALUES('shefdemwp','uwerim.ru','91.121.21','2015-08-17
01:46:49',NOW()):#012, reconnecting to DB
Aug 17 05:57:58 mail02 sqlgrey: dbaccess: warning: couldn't do
query:#012INSERT INTO from_awl (sender_name, sender_domain, src,
first_seen, last_seen)
VALUES('conor.burrell','businessandfinance.com','91.121.21','2015-08-17
05:51:17',NOW()):#012, reconnecting to DB
Aug 17 09:43:48 mail01 sqlgrey: dbaccess: warning: couldn't do
query:#012INSERT INTO from_awl (sender_name, sender_domain, src,
first_seen, last_seen)
VALUES('claudinei.moreira','alper.com.br','91.121.21','2015-08-17
09:37:12',NOW()):#012, reconnecting to DB
Aug 17 10:26:43 mail01 sqlgrey: dbaccess: warning: couldn't do
query:#012UPDATE domain_awl SET last_seen = NOW(), first_seen =
first_seen WHERE sender_domain = NULL AND src = NULL:#012Can't connect
to MySQL server on 'ns1.example.com' (111), reconnecting to DB
Aug 17 12:44:38 mail01 sqlgrey: dbaccess: warning: couldn't do
query:#012INSERT INTO from_awl (sender_name, sender_domain, src,
first_seen, last_seen)
VALUES('memberservices=example-marketplace.com__#-21wpwe5yvrgp4h','rlie15guji343ap6.mnrzh7tg8d0vr194.m8vlizz.8-amgceai.na28.bnc.salesforce.com','136.146.208','2015-08-17
12:39:23',NOW()):#012, reconnecting to DB

Our server was changed to "example" in the above. It looks like maybe
some records were corrupted? Multiple entries contain the 91.121.21
address. Should I just truncate all the tables and let them rebuild
themselves with new email data or perhaps there's a more precise fix?

Thanks,
Alex

Re: [Sqlgrey-users] Frequent disconnect/reconnect issues

From: Alex <mys...@gm...> - 2015-08-19 17:34:00

Hi guys,

I'm still having database problems. Can someone tell me under what
circumstances the database would expect to see a value that doesn't
exist?

Aug 19 00:07:39 mail02 sqlgrey: dbaccess: warning: couldn't do
query:#012INSERT INTO from_awl (sender_name, sender_domain, src,
first_seen, last_seen)
VALUES('shimvebvl','uwerim.ru','91.121.21','2015-08-19
00:00:57',NOW()):#012, reconnecting to DB

And why this would cause a disconnect/reconnect to occur?

I've searched all three systems in the DB_Cluster, and none of them
have any of these values. Why would sqlgrey think it's there in the
first place?

This appears to be the original occurrence of that entry, just a few
minutes prior:

Aug 19 00:00:57 mail02 postfix/smtpd[24228]: NOQUEUE: reject: RCPT
from webboundaries.com[91.121.21.199]: 450 4.7.1
<lan...@ex...>: Recipient address rejected: Greylisted
for 5 minutes; from=<shi...@uw...>
to=<lan...@ex...> proto=ESMTP helo=<webboundaries.com>

It then failed open, where postfix accepted it and was spamassassin
marked it as spam.

Thanks,
Alex

On Tue, Aug 18, 2015 at 11:58 AM, Alex <mys...@gm...> wrote:
> Hi guys,
>
>> I'd expect sqlgrey to be configured to log to the
>> mail log, although I suppose yours could be different.
>> This will give you timestamps to use when looking through
>> the db log.
>
> I've figured out some new information on this problem.
>
> Aug 18 06:06:19 mail01 sqlgrey: warning: Use of uninitialized value
> $DBI::errstr in concatenation (.) or string at /usr/sbin/sqlgrey line
> 250.
> Aug 18 06:06:19 mail01 sqlgrey: dbaccess: warning: couldn't do
> query:#012INSERT INTO from_awl (sender_name, sender_domain, src,
> first_seen, last_seen)
> VALUES('return','nl2go.com','91.121.21','2015-08-18
> 05:59:38',NOW()):#012, reconnecting to DB
> Aug 18 06:06:19 mail01 sqlgrey: dbaccess: Using DBIx:DBCluster
> Aug 18 06:06:19 mail01 sqlgrey: dbaccess: Read_hosts: localhost,
> ns1.example.com, mail02.example.com, mail03.example.com,
> mail01.example.com
>
> Aug 17 01:43:40 mail02 sqlgrey: dbaccess: warning: couldn't do
> query:#012INSERT INTO from_awl (sender_name, sender_domain, src,
> first_seen, last_seen)
> VALUES('financeiro','teamplayy.com.br','91.121.21','2015-08-17
> 01:37:00',NOW()):#012, reconnecting to DB
> Aug 17 01:53:29 mail01 sqlgrey: dbaccess: warning: couldn't do
> query:#012INSERT INTO from_awl (sender_name, sender_domain, src,
> first_seen, last_seen)
> VALUES('shefdemwp','uwerim.ru','91.121.21','2015-08-17
> 01:46:49',NOW()):#012, reconnecting to DB
> Aug 17 05:57:58 mail02 sqlgrey: dbaccess: warning: couldn't do
> query:#012INSERT INTO from_awl (sender_name, sender_domain, src,
> first_seen, last_seen)
> VALUES('conor.burrell','businessandfinance.com','91.121.21','2015-08-17
> 05:51:17',NOW()):#012, reconnecting to DB
> Aug 17 09:43:48 mail01 sqlgrey: dbaccess: warning: couldn't do
> query:#012INSERT INTO from_awl (sender_name, sender_domain, src,
> first_seen, last_seen)
> VALUES('claudinei.moreira','alper.com.br','91.121.21','2015-08-17
> 09:37:12',NOW()):#012, reconnecting to DB
> Aug 17 10:26:43 mail01 sqlgrey: dbaccess: warning: couldn't do
> query:#012UPDATE domain_awl SET last_seen = NOW(), first_seen =
> first_seen WHERE sender_domain = NULL AND src = NULL:#012Can't connect
> to MySQL server on 'ns1.example.com' (111), reconnecting to DB
> Aug 17 12:44:38 mail01 sqlgrey: dbaccess: warning: couldn't do
> query:#012INSERT INTO from_awl (sender_name, sender_domain, src,
> first_seen, last_seen)
> VALUES('memberservices=example-marketplace.com__#-21wpwe5yvrgp4h','rlie15guji343ap6.mnrzh7tg8d0vr194.m8vlizz.8-amgceai.na28.bnc.salesforce.com','136.146.208','2015-08-17
> 12:39:23',NOW()):#012, reconnecting to DB
>
> Our server was changed to "example" in the above. It looks like maybe
> some records were corrupted? Multiple entries contain the 91.121.21
> address. Should I just truncate all the tables and let them rebuild
> themselves with new email data or perhaps there's a more precise fix?
>
> Thanks,
> Alex

Re: [Sqlgrey-users] Frequent disconnect/reconnect issues

From: Karl O. P. <ko...@me...> - 2015-08-19 17:42:38

On Wed, 19 Aug 2015 13:33:49 -0400
Alex <mys...@gm...> wrote:

> Hi guys,
> 
> I'm still having database problems. Can someone tell me under what
> circumstances the database would expect to see a value that doesn't
> exist?
> 
> Aug 19 00:07:39 mail02 sqlgrey: dbaccess: warning: couldn't do
> query:#012INSERT INTO from_awl (sender_name, sender_domain, src,
> first_seen, last_seen)
> VALUES('shimvebvl','uwerim.ru','91.121.21','2015-08-19
> 00:00:57',NOW()):#012, reconnecting to DB
> 
> And why this would cause a disconnect/reconnect to occur?

I believe you have it backwards.  The message
says it can't insert the value.  Because it can't
insert the value you wouldn't expect to see the
value in the db.  It's probably not inserting the
value because it's connection to the db is broken.
So it tries to reconnect to the db.

Something, somewhere, probably not in sqlgrey,
probably not even in perl -- although all these
probablies are just a guess, is breaking the 
connection to the db.

As I mentioned before, the most straightforward
approach is to instrument your db and get logs
of when connections break or otherwise go away.
Your db must know when this happens.

Karl <ko...@me...>
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein

Re: [Sqlgrey-users] Frequent disconnect/reconnect issues

From: Lionel B. <lio...@bo...> - 2015-08-19 17:43:38

Hi,

I don't have answers to all your questions but could you explain why you
are using db_cluster and how/why you set the replication to 5 MySQL
databases? This seems overkill and needlessly complex.

Can you reproduce your problem with a single database and db_cluster off?

Best regards,

Lionel

Re: [Sqlgrey-users] Frequent disconnect/reconnect issues

From: Alex <mys...@gm...> - 2015-08-19 18:12:28

Hi,

> I don't have answers to all your questions but could you explain why you
> are using db_cluster and how/why you set the replication to 5 MySQL
> databases? This seems overkill and needlessly complex.

It's actually three databases - two of them have multiple CNAMEs that
appear to be chosen on occasion.

I'm using db_cluster because I have three mail servers for the same
domain with the same weight, resulting in a reconnect potentially
reaching a different mail server, otherwise resulting in another 5
minute greylisting delay.

I thought that was the correct way to address this situation?

> Can you reproduce your problem with a single database and db_cluster off?

For the reason described above, I don't think it would be a good idea
to disconnect them from each other.

This configuration seemed to run just fine for months.

There are entries in the mysql error logs when I deliberately
disconnected the master from the slave, but nothing around the time
the dbaccess error was triggered.

Should I look at recording every INSERT in the logs somehow?

Thanks,
Alex

Re: [Sqlgrey-users] Frequent disconnect/reconnect issues

From: Lionel B. <lio...@bo...> - 2015-08-19 18:38:51

Le 19/08/2015 20:12, Alex a écrit :
> Hi,
>
>> I don't have answers to all your questions but could you explain why you
>> are using db_cluster and how/why you set the replication to 5 MySQL
>> databases? This seems overkill and needlessly complex.
> It's actually three databases - two of them have multiple CNAMEs that
> appear to be chosen on occasion.

You shouldn't use 5 entries in read_hosts but only 2 or 3 depending on
what kind of load distribution between your databases you want. But it's
probably not relevant in your case.

>
> I'm using db_cluster because I have three mail servers for the same
> domain with the same weight,

Do you have system load problems which mandates you having 3 mail
servers? It's usually not a good idea to have several active mail
servers unless you really need them. It's a general recommendation but
in the case of SQLgrey some spammers try to send mails to all servers at
the same time and it can confuse SQLgrey which tries to insert 3 times
the same entry because each instance can't known that the 2 others are
trying too. In this case SQLgrey will handle this as a database error
(because it can't guess if it's not one) and try to reconnect. With a
single server this is not possible.

>  resulting in a reconnect potentially
> reaching a different mail server, otherwise resulting in another 5
> minute greylisting delay.
>
> I thought that was the correct way to address this situation?

No: db_cluster is only meant to solve rare performance issues, has some
drawbacks (replication delay of most replication solutions means slaves
are out of sync with the master which can confuse SQLgrey in some cases)
and requires a bit of DBA work to setup properly. Unless your master
database is so loaded that SQLgrey starts to timeout you shouldn't use
dbcluster.

>> Can you reproduce your problem with a single database and db_cluster off?
> For the reason described above, I don't think it would be a good idea
> to disconnect them from each other.

I don't understand what you mean by "disconnect them from each other".
Disconnecting databases from each other? Disconnecting mail servers from
database servers?

Your database should already be kept in sync by MySQL replication
(although I learned not to trust it myself) so disabling db_cluster
shouldn't have any impact on the database, you will simply write and
read from your single master (ns1.example.com) instead of writing to it
and reading from read_hosts. The slaves won't be used but you can keep
them if you really need to re-enable db_cluster later.
Disconnecting mail servers from the database servers (other than
ns1.example.com) should have zero impact: they are supposed to have the
same data anyway (in fact that's not true there's a replication delay
which can cause problems so disabling db_cluster can actually solve
problems there).

>
> This configuration seemed to run just fine for months.

Sometimes things work by chance even if they aren't set up properly
until the right set of conditions making them fail happen.

Best regards,

Lionel

Re: [Sqlgrey-users] Frequent disconnect/reconnect issues

From: Alex <mys...@gm...> - 2015-08-20 01:09:49

Hi,

>>> I don't have answers to all your questions but could you explain why you
>>> are using db_cluster and how/why you set the replication to 5 MySQL
>>> databases? This seems overkill and needlessly complex.
>> It's actually three databases - two of them have multiple CNAMEs that
>> appear to be chosen on occasion.
>
> You shouldn't use 5 entries in read_hosts but only 2 or 3 depending on
> what kind of load distribution between your databases you want. But it's
> probably not relevant in your case.

It's not five. It's really only two, unless you count itself. The
other two are localhost and a CNAME entry.

>> I'm using db_cluster because I have three mail servers for the same
>> domain with the same weight,
>
> Do you have system load problems which mandates you having 3 mail
> servers? It's usually not a good idea to have several active mail
> servers unless you really need them.

Yes, for load and redundancy reasons. The three servers receive about
400k messages per day, at peak.

> It's a general recommendation but
> in the case of SQLgrey some spammers try to send mails to all servers at
> the same time and it can confuse SQLgrey which tries to insert 3 times
> the same entry because each instance can't known that the 2 others are
> trying too. In this case SQLgrey will handle this as a database error
> (because it can't guess if it's not one) and try to reconnect. With a
> single server this is not possible.

Okay, that's interesting. And you believe it's possible that's what
happening here?

>>  resulting in a reconnect potentially
>> reaching a different mail server, otherwise resulting in another 5
>> minute greylisting delay.
>>
>> I thought that was the correct way to address this situation?
>
> No: db_cluster is only meant to solve rare performance issues, has some
> drawbacks (replication delay of most replication solutions means slaves
> are out of sync with the master which can confuse SQLgrey in some cases)
> and requires a bit of DBA work to setup properly. Unless your master
> database is so loaded that SQLgrey starts to timeout you shouldn't use
> dbcluster.

Except if the master becomes inaccessible, no?

>>> Can you reproduce your problem with a single database and db_cluster off?
>> For the reason described above, I don't think it would be a good idea
>> to disconnect them from each other.
>
> I don't understand what you mean by "disconnect them from each other".
> Disconnecting databases from each other? Disconnecting mail servers from
> database servers?
>
> Your database should already be kept in sync by MySQL replication
> (although I learned not to trust it myself) so disabling db_cluster
> shouldn't have any impact on the database, you will simply write and
> read from your single master (ns1.example.com) instead of writing to it
> and reading from read_hosts. The slaves won't be used but you can keep
> them if you really need to re-enable db_cluster later.

I think I've been confusing MySQL replication and dbcluster as being
something that must be used together.

> Disconnecting mail servers from the database servers (other than
> ns1.example.com) should have zero impact: they are supposed to have the
> same data anyway (in fact that's not true there's a replication delay
> which can cause problems so disabling db_cluster can actually solve
> problems there).

Okay, great. I'll disable dbcluster and follow up. I was following the README:

    If [mail1] gets a new request, it'll greylist, respond "450" and
stick client into the "connect" table.
    Now heres the problem. Client backs off, and comes back later to
try again, but there is no
    garantee it'll get [mail1] again. Infact, if it doesnt, the
greylisting will happen all over. The solution
    is to use DBClustering. Each mailserver STILL has its own
SQL-server, but we add a master-sql
    and let the local sql-servers be replication slaves:

I don't have the load balancer, as shown in the drawing in the README.
I guess I don't fully understand the purpose of db_cluster.

Thanks so much,
Alex

Re: [Sqlgrey-users] Frequent disconnect/reconnect issues

From: Urban, F. <Fra...@co...> - 2015-08-20 05:46:10

Hi Alex,

here my configuring working since 10 years without any problem. I just got new servers and transferred the config and database from Mysql to Mariadb also without any problem.
In the last 365 days Sqlgerey blocked 150.000.000 emails here. It was more in the past up to 1.000.000.000 per year.
We still have only 20% SPAM after Greylisting.

We are using 7 servers. mail1 to mail7
The Mariadb is installed as master on mail6 and uses mail7 as a replication server.
mail6 is used as AWL and Bayes database for SpamAssassin 
mail6 is used as master for Sqlgrey and mail7 as reading host

That's the config used on all servers

/etc/sqlgrey/sqlgrey.conf 

db_host = mail6
db_cluster = on
read_hosts = mail7,mail6
db_cleanup_hostname = mail6


Greetings

Frank

-----Ursprüngliche Nachricht-----
Von: Alex [mailto:mys...@gm...] 
Gesendet: Donnerstag, 20. August 2015 03:10
An: SQLgrey users mailing-list
Betreff: Re: [Sqlgrey-users] Frequent disconnect/reconnect issues

Hi,

>>> I don't have answers to all your questions but could you explain why 
>>> you are using db_cluster and how/why you set the replication to 5 
>>> MySQL databases? This seems overkill and needlessly complex.
>> It's actually three databases - two of them have multiple CNAMEs that 
>> appear to be chosen on occasion.
>
> You shouldn't use 5 entries in read_hosts but only 2 or 3 depending on 
> what kind of load distribution between your databases you want. But 
> it's probably not relevant in your case.

It's not five. It's really only two, unless you count itself. The other two are localhost and a CNAME entry.

>> I'm using db_cluster because I have three mail servers for the same 
>> domain with the same weight,
>
> Do you have system load problems which mandates you having 3 mail 
> servers? It's usually not a good idea to have several active mail 
> servers unless you really need them.

Yes, for load and redundancy reasons. The three servers receive about 400k messages per day, at peak.

> It's a general recommendation but
> in the case of SQLgrey some spammers try to send mails to all servers 
> at the same time and it can confuse SQLgrey which tries to insert 3 
> times the same entry because each instance can't known that the 2 
> others are trying too. In this case SQLgrey will handle this as a 
> database error (because it can't guess if it's not one) and try to 
> reconnect. With a single server this is not possible.

Okay, that's interesting. And you believe it's possible that's what happening here?

>>  resulting in a reconnect potentially reaching a different mail 
>> server, otherwise resulting in another 5 minute greylisting delay.
>>
>> I thought that was the correct way to address this situation?
>
> No: db_cluster is only meant to solve rare performance issues, has 
> some drawbacks (replication delay of most replication solutions means 
> slaves are out of sync with the master which can confuse SQLgrey in 
> some cases) and requires a bit of DBA work to setup properly. Unless 
> your master database is so loaded that SQLgrey starts to timeout you 
> shouldn't use dbcluster.

Except if the master becomes inaccessible, no?

>>> Can you reproduce your problem with a single database and db_cluster off?
>> For the reason described above, I don't think it would be a good idea 
>> to disconnect them from each other.
>
> I don't understand what you mean by "disconnect them from each other".
> Disconnecting databases from each other? Disconnecting mail servers 
> from database servers?
>
> Your database should already be kept in sync by MySQL replication 
> (although I learned not to trust it myself) so disabling db_cluster 
> shouldn't have any impact on the database, you will simply write and 
> read from your single master (ns1.example.com) instead of writing to 
> it and reading from read_hosts. The slaves won't be used but you can 
> keep them if you really need to re-enable db_cluster later.

I think I've been confusing MySQL replication and dbcluster as being something that must be used together.

> Disconnecting mail servers from the database servers (other than
> ns1.example.com) should have zero impact: they are supposed to have 
> the same data anyway (in fact that's not true there's a replication 
> delay which can cause problems so disabling db_cluster can actually 
> solve problems there).

Okay, great. I'll disable dbcluster and follow up. I was following the README:

    If [mail1] gets a new request, it'll greylist, respond "450" and stick client into the "connect" table.
    Now heres the problem. Client backs off, and comes back later to try again, but there is no
    garantee it'll get [mail1] again. Infact, if it doesnt, the greylisting will happen all over. The solution
    is to use DBClustering. Each mailserver STILL has its own SQL-server, but we add a master-sql
    and let the local sql-servers be replication slaves:

I don't have the load balancer, as shown in the drawing in the README.
I guess I don't fully understand the purpose of db_cluster.

Thanks so much,
Alex

------------------------------------------------------------------------------
_______________________________________________
Sqlgrey-users mailing list
Sql...@li...
https://lists.sourceforge.net/lists/listinfo/sqlgrey-users

Re: [Sqlgrey-users] Frequent disconnect/reconnect issues

From: Lionel B. <lio...@bo...> - 2015-08-20 10:33:53

Le 20/08/2015 03:09, Alex a écrit :
> Hi,
>
>>>> I don't have answers to all your questions but could you explain why you
>>>> are using db_cluster and how/why you set the replication to 5 MySQL
>>>> databases? This seems overkill and needlessly complex.
>>> It's actually three databases - two of them have multiple CNAMEs that
>>> appear to be chosen on occasion.
>> You shouldn't use 5 entries in read_hosts but only 2 or 3 depending on
>> what kind of load distribution between your databases you want. But it's
>> probably not relevant in your case.
> It's not five. It's really only two, unless you count itself. The
> other two are localhost and a CNAME entry.

Then you shouldn't have 5 entries in read_hosts as I already said.
Seriously if you don't understand why 5 entries for 3 servers can cause
problems you shouldn't even think about touching db_cluster: you are
clearly using it wrong.

>
>>> I'm using db_cluster because I have three mail servers for the same
>>> domain with the same weight,
>> Do you have system load problems which mandates you having 3 mail
>> servers? It's usually not a good idea to have several active mail
>> servers unless you really need them.
> Yes, for load and redundancy reasons. The three servers receive about
> 400k messages per day, at peak.

This is only <5 messages per second. Hardly something a single server
can't handle (unless you use costly antispam measures that you can't
delegate to backend servers).

This kind of redundancy is mostly superfluous with SMTP: if you have a
spare server ready to replace your mail server (ie: affect it the public
IP of your single MX to it or redirect it if you have some kind of
reverse proxy), you only have to start it to get the messages queued at
the origin delivered to it.

>> It's a general recommendation but
>> in the case of SQLgrey some spammers try to send mails to all servers at
>> the same time and it can confuse SQLgrey which tries to insert 3 times
>> the same entry because each instance can't known that the 2 others are
>> trying too. In this case SQLgrey will handle this as a database error
>> (because it can't guess if it's not one) and try to reconnect. With a
>> single server this is not possible.
> Okay, that's interesting. And you believe it's possible that's what
> happening here?

It's a known problem so yes it's possible.

>
>>>  resulting in a reconnect potentially
>>> reaching a different mail server, otherwise resulting in another 5
>>> minute greylisting delay.
>>>
>>> I thought that was the correct way to address this situation?
>> No: db_cluster is only meant to solve rare performance issues, has some
>> drawbacks (replication delay of most replication solutions means slaves
>> are out of sync with the master which can confuse SQLgrey in some cases)
>> and requires a bit of DBA work to setup properly. Unless your master
>> database is so loaded that SQLgrey starts to timeout you shouldn't use
>> dbcluster.
> Except if the master becomes inaccessible, no?

Even with db_cluster on if the master becomes inaccessible SQLgrey stops
greylisting and let everything pass. If it can't write to the database
there's no way it can do its job.

>
>>>> Can you reproduce your problem with a single database and db_cluster off?
>>> For the reason described above, I don't think it would be a good idea
>>> to disconnect them from each other.
>> I don't understand what you mean by "disconnect them from each other".
>> Disconnecting databases from each other? Disconnecting mail servers from
>> database servers?
>>
>> Your database should already be kept in sync by MySQL replication
>> (although I learned not to trust it myself) so disabling db_cluster
>> shouldn't have any impact on the database, you will simply write and
>> read from your single master (ns1.example.com) instead of writing to it
>> and reading from read_hosts. The slaves won't be used but you can keep
>> them if you really need to re-enable db_cluster later.
> I think I've been confusing MySQL replication and dbcluster as being
> something that must be used together.

I don't know why you think this after our exchanges. They don't have to
work *together* but if the replication doesn't work properly SQLgrey
with db_cluster will surely fail to work properly.

>> Disconnecting mail servers from the database servers (other than
>> ns1.example.com) should have zero impact: they are supposed to have the
>> same data anyway (in fact that's not true there's a replication delay
>> which can cause problems so disabling db_cluster can actually solve
>> problems there).
> Okay, great. I'll disable dbcluster and follow up. I was following the README:
>
>     If [mail1] gets a new request, it'll greylist, respond "450" and
> stick client into the "connect" table.
>     Now heres the problem. Client backs off, and comes back later to
> try again, but there is no
>     garantee it'll get [mail1] again. Infact, if it doesnt, the
> greylisting will happen all over. The solution
>     is to use DBClustering. Each mailserver STILL has its own
> SQL-server, but we add a master-sql
>     and let the local sql-servers be replication slaves:
>
> I don't have the load balancer, as shown in the drawing in the README.
> I guess I don't fully understand the purpose of db_cluster.

db_cluster support wasn't written by me and I never used it so I just
read its README again to check.
I think it doesn't really explain when you should use it properly, just
ignore any advice in it not linked to database load problems.

Best regards,

Lionel