From: Klaus A. S. <kse...@gm...> - 2004-12-19 19:37:59
Hi,

Just wondering... My primary MX, mx.szn.dk, accepts connections on both IPv4 and IPv6. How does SQLgrey handle IPv6 addresses? Would anyone with an IPv6-capable MTA care to test-mail me at kl...@se... so I can see how SQLgrey reacts?

Thanks.

Cheers,
--
Klaus Alexander Seistrup
SubZeroNet · Copenhagen · Denmark
From: Rene J. <rg...@ba...> - 2004-12-18 23:24:19
On Fri, Dec 17, 2004 at 05:04:44PM +0100, Rene Joergensen wrote:
> When an index is needed on from_awl I'll return with more results.

Sorry, should have checked before writing that; indexes already exist on from_awl and domain_awl.

--
-René
From: Rene J. <rg...@ba...> - 2004-12-17 16:04:53
On Thu, Dec 16, 2004 at 02:44:22PM +0100, Lionel Bouton wrote:
> If you could test how much time each index on the connect table saves,
> that would help me decide which ones to create.

I've just put it into production :o) Commented out the cleanup code and wrote a small perlscript to maintain the db until v. 1.4.1. With around 7000 rows in connect, mysqld started hovering around 80-90% CPU (dual 2.6GHz Xeon with 1GB RAM and 2 SCSI disks); after putting an index on ip_addr (alter table connect add index(ip_addr)) mysqld now uses around 2% CPU.

Before:

mysql> SELECT 1 FROM connect WHERE sender_name = 'rgj' AND sender_domain = 'bananas.dk' AND ip_addr = '194.255.237' AND rcpt = 'just@biteme.dk' AND first_seen BETWEEN now() - INTERVAL 24 HOUR AND now() - INTERVAL 5 MINUTE;
+---+
| 1 |
+---+
| 1 |
+---+
1 row in set (0.03 sec)

After:

+---+
| 1 |
+---+
| 1 |
+---+
1 row in set (0.00 sec)

mysql> SELECT count(*) FROM connect WHERE first_seen < NOW() - INTERVAL 24 HOUR;
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (0.05 sec)

mysql> alter table connect add index(first_seen);
Query OK, 17121 rows affected (0.23 sec)
Records: 17121  Duplicates: 0  Warnings: 0

mysql> SELECT count(*) FROM connect WHERE first_seen < NOW() - INTERVAL 24 HOUR;
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (0.00 sec)

When an index is needed on from_awl I'll return with more results.

--
-René
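Rene's before/after timings come straight from adding an index on a hot column. The effect can be reproduced in miniature with SQLite standing in for MySQL; the table layout below is guessed from the queries quoted above, not SQLgrey's actual schema:

```python
import sqlite3

# Hypothetical miniature of the "connect" table (columns assumed from
# the quoted queries); SQLite stands in for MySQL here.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE connect (
    sender_name TEXT, sender_domain TEXT, ip_addr TEXT,
    rcpt TEXT, first_seen TIMESTAMP)""")
db.executemany(
    "INSERT INTO connect VALUES (?, ?, ?, ?, datetime('now'))",
    [(f"user{i}", "example.com", f"10.0.{i % 256}", f"r{i}@example.org")
     for i in range(10_000)])

query = "SELECT 1 FROM connect WHERE ip_addr = '10.0.42'"

# Without an index the planner reports a full table scan
# (detail wording varies by SQLite version, e.g. "SCAN connect").
plan = db.execute("EXPLAIN QUERY PLAN " + query).fetchone()[3]
print(plan)

# After indexing ip_addr the same query becomes an index search.
db.execute("CREATE INDEX connect_ip ON connect (ip_addr)")
plan = db.execute("EXPLAIN QUERY PLAN " + query).fetchone()[3]
print(plan)
```

On Rene's 17,000-row table the same switch from scan to index search is what dropped mysqld from 80-90% CPU to around 2%.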
From: Lionel B. <lio...@bo...> - 2004-12-17 14:44:57
Rene Joergensen wrote the following on 12/17/04 15:26:

> On Fri, Dec 17, 2004 at 01:32:35PM +0100, Lionel Bouton wrote:
>> Because 1.4 should be for bugfixes and really simple additions only.
>> For example maint_delay set to 0 is a bug, it will be fixed in 1.4.1.
>> Allowing whitelists reload from a central server is not intrusive: it
>> will be done in 1.4.x.
>
> When is 1.4.1 due to be released? I've looked in CVS, and it doesn't
> look like the bugfix is committed yet. Friday afternoon is the best
> time for implementing new solutions. ;)

You'll have to wait a little. I have guests this evening and am switching from keyboard to cooking tools!
From: Rene J. <rg...@ba...> - 2004-12-17 14:26:57
On Fri, Dec 17, 2004 at 01:32:35PM +0100, Lionel Bouton wrote:
> Because 1.4 should be for bugfixes and really simple additions only.
> For example maint_delay set to 0 is a bug, it will be fixed in 1.4.1.
> Allowing whitelists reload from a central server is not intrusive: it
> will be done in 1.4.x.

When is 1.4.1 due to be released? I've looked in CVS, and it doesn't look like the bugfix is committed yet. Friday afternoon is the best time for implementing new solutions. ;)

> I have already considered dropping this part of the statement (with the
> same remarks in mind), but I didn't do it already for the following reasons:
> - doing so will make the results depend on the cleanup, which might very
> well become a separate process which might have to handle subtle
> problems when you have several SQLgrey instances accessing the same database,

The cleaning that is performed at the moment isn't very complex, and if it doesn't change, I don't think there should be any problems in dropping the statement.

> - I'm not sure this will save much time; remember that SQL doesn't slow
> down much when performing computations, it gets slow when it involves
> reading a lot of rows, sorting results that don't fit in memory or
> performing some kinds of joins between tables.

If there's an index on the columns in the select, the gain is probably minimal. But the best way is to get some data to perform tests on.

[...]

> Hope my explanation was enough to convince you it's not the case, it's a
> subtle problem.

My approach was that only one SQLgrey instance was running, as is the case in our setup with postgrey (and will be with SQLgrey).

--
-René
From: Lionel B. <lio...@bo...> - 2004-12-17 12:33:28
Rene Joergensen wrote the following on 12/17/04 11:42:

> [...]
>> Currently I'm inclined to make the interval configurable in 1.4 with a
>> more sensible default than "0" and store a last_cleanup timestamp in
>> database in 1.6 to easily support multiple instances accessing the same
>> database.
>
> Why wait until 1.6? Currently you've got the config-table which could be
> fine for storing a timestamp.

Because 1.4 should be for bugfixes and really simple additions only. For example maint_delay set to 0 is a bug, it will be fixed in 1.4.1. Allowing whitelists reload from a central server is not intrusive: it will be done in 1.4.x. But for new functions that involve more than a few lines of code, I'll start the 1.5.x dev branch. People that don't want to test new code and want more reliable software will use 1.4.x updates; others will test the 1.5.x releases until they stabilize and 1.6.0 comes out...

>> Yep, the order of the checks is designed to minimize the number of
>> queries done to the database.
>
> A couple of comments:
>
> When checking domain_awl and from_awl, which are both cleaned regularly,
> there is no need for "AND last_seen > now() - INTERVAL 60 DAY". It
> doesn't matter if it's 60 days + 30 minutes, and it'll make the query a
> little lighter.

I have already considered dropping this part of the statement (with the same remarks in mind), but I didn't do it already for the following reasons:
- doing so will make the results depend on the cleanup, which might very well become a separate process which might have to handle subtle problems when you have several SQLgrey instances accessing the same database,
- I'm not sure this will save much time; remember that SQL doesn't slow down much when performing computations, it gets slow when it involves reading a lot of rows, sorting results that don't fit in memory or performing some kinds of joins between tables.

I might drop this condition in the future given the following conditions:
- tests show there is a gain in performance,
- I have clear ideas on how the cleanup process will shape up.

> First time a triplet is seen a delete is still performed on the
> connect-table to erase an entry that doesn't exist,

In fact it could exist, but the way SQLgrey handles this is not clean (I couldn't find any way to make it so). Let me explain. SQLgrey is coded with multiple instances accessing the same database in mind. So nothing prevents another instance adding a connect entry that will clash with the one this instance wants to insert: just imagine 2 MX receiving 2 mails with the same sender/recipient/client IP; the corresponding SQLgrey instances will both try to add a very similar entry to the connect table (only the timestamp can change). If they do it at the same time, there's no way to ensure the code can detect it reliably (MySQL doesn't support PRIMARY KEYs large enough; this is one of the little annoyances that usually make me prefer PostgreSQL). So SQLgrey tries to reduce the window as much as possible by DELETING just before INSERTING.

Having 2 entries will bite back when trying to match a reconnection against the earlier connection: SQLgrey doesn't like the fact that there could be 2 entries in the table for the same connection. In fact, by rereading the whole code involved, there's a bug: SQLgrey should allow several entries and, if one matches, allow the message to pass (the consistency checks done by SQLgrey wrongly assume there is a PRIMARY KEY). There will still be problems when computing the reconnection delay, but there's not much we can do about it. The window for this bug is really small and can only happen on multiple-SQLgrey-instance installations, but it can happen (I just added this bug to my TODO for 1.4.x).

> maybe the result of the previous check in the connect-table could be
> used to determine if a delete is necessary before inserting the row.

Hope my explanation was enough to convince you it's not the case; it's a subtle problem.

> Maybe an auto-incrementing key could be implemented on the connect-table?
> When the first select is performed on the connect-table, the id-key could
> be returned, making deletes easier for the db.

I'm not sure I understand. Which delete do you want to speed up? The one that occurs on the connect table when the reconnection succeeds? This can be done, but there won't be a way to check that another connect entry was added *while* the auto-whitelisting process was going on. You would have an entry in connect that will only be removed by the "cleanup" functions and that will report a false positive (as the reconnect will match the auto-whitelist instead of the connect entry, the entry will remain here).

>> Another entry in my TODO list... Probably for the 1.4 releases.
>
> Sounds like a good idea, maybe I'll look at coding something like that,
> if I get the time.

That would be a great addition I'd be thankful for,

Lionel.
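The DELETE-before-INSERT window Lionel describes can be sketched in a few lines. SQLite stands in for the real database, and the column layout is an assumption for illustration, not SQLgrey's actual schema:

```python
import sqlite3

db = sqlite3.connect(":memory:")
# No PRIMARY KEY over the full triplet+rcpt: MySQL cannot hold one over
# columns this wide, which is exactly the constraint discussed above.
db.execute("""CREATE TABLE connect (
    sender_name TEXT, sender_domain TEXT, ip_addr TEXT,
    rcpt TEXT, first_seen TIMESTAMP)""")

def record_connect(db, name, domain, ip, rcpt):
    # Narrow the duplicate window: delete any entry another instance may
    # have inserted for the same connection, then insert our own.
    db.execute("DELETE FROM connect WHERE sender_name=? AND "
               "sender_domain=? AND ip_addr=? AND rcpt=?",
               (name, domain, ip, rcpt))
    db.execute("INSERT INTO connect VALUES (?,?,?,?,datetime('now'))",
               (name, domain, ip, rcpt))

# Two MX hosts seeing the same sender/recipient/client IP end up with a
# single row instead of two, as long as the calls don't interleave.
record_connect(db, "rgj", "bananas.dk", "194.255.237", "just@biteme.dk")
record_connect(db, "rgj", "bananas.dk", "194.255.237", "just@biteme.dk")
count = db.execute("SELECT count(*) FROM connect").fetchone()[0]
print(count)  # 1
```

The window is only narrowed, not closed: if two instances both pass the DELETE before either INSERTs, two rows still result, which is the residual bug Lionel adds to his TODO.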
From: Rene J. <rg...@ba...> - 2004-12-17 10:42:29
On Thu, Dec 16, 2004 at 02:44:22PM +0100, Lionel Bouton wrote:
> As SQLgrey is required to verify the database version at startup (and if
> you look at the code closely, you'll see that I paid in messy code for the
> mistake of not storing the layout version from the start) I'm inclined
> to let SQLgrey handle the upgrade too.

I still think that it's the admin's job to upgrade/maintain the db, but if that's how you like it, that's the way it's got to be :-)

> conversion, only supporting v1 to v2 conversion. In the less frequent
> case of a layout at version 0 SQLgrey will "die" with a message advising
> to use an external script to upgrade from version 0 to 1.

Why not do the same thing in any case (in case of version 1 when version 2 is needed)?

>> - Timed interval, eg. once every 2 hours.
> Supported in the code. Deactivated by mistake.

Yeah, I can see it's back in the soon-to-come v. 1.4.1.

> Currently I'm inclined to make the interval configurable in 1.4 with a
> more sensible default than "0" and store a last_cleanup timestamp in
> database in 1.6 to easily support multiple instances accessing the same
> database.

Why wait until 1.6? Currently you've got the config-table which could be fine for storing a timestamp.

> Yep, the order of the checks is designed to minimize the number of
> queries done to the database.

A couple of comments:

When checking domain_awl and from_awl, which are both cleaned regularly, there is no need for "AND last_seen > now() - INTERVAL 60 DAY". It doesn't matter if it's 60 days + 30 minutes, and it'll make the query a little lighter.

The first time a triplet is seen, a delete is still performed on the connect-table to erase an entry that doesn't exist; maybe the result of the previous check in the connect-table could be used to determine if a delete is necessary before inserting the row.

Maybe an auto-incrementing key could be implemented on the connect-table? When the first select is performed on the connect-table, the id-key could be returned, making deletes easier for the db.

> Wise decision. If I had reports from users with your level of traffic I
> could have said whether the default SQLgrey install would (or would not)
> have been suited to you, but you may be the first one trying with this
> kind of traffic, so it's better to be able to monitor the system for at
> least the first day.

Well, I found an open access-point yesterday, so maybe on Monday :o)

> Another entry in my TODO list... Probably for the 1.4 releases.

Sounds like a good idea, maybe I'll look at coding something like that, if I get the time.

--
-René
From: David R. <dr...@gr...> - 2004-12-16 19:45:28
Lionel Bouton wrote:
> Remember that even with Multiplex you can fork early to launch another
> process which can do the asynchronous cleanups. You won't have to
> maintain cron entries this way.

Another good idea, I like it better than crontab entries.

-Dave
From: Lionel B. <lio...@bo...> - 2004-12-16 16:20:40
David Rees wrote the following on 12/16/04 16:56:
> Rene Joergensen wrote, On 12/16/2004 3:25 AM:
>> - Cleaning from cron, calling sqlgrey with --db-clean (or a separate
>> perlscript (but that would mean more code to maintain)), this is in my
>> eyes a pretty flexible way, and non-blocking without rewriting SQLgrey
>> to use Net::Server::Prefork, and then just tell SQLgrey that cleaning
>> is done externally, and it doesn't have to worry about it. Default
>> should be one of the other ways, to avoid people who don't RTFM
>> complaining about large databases.

Remember that even with Multiplex you can fork early to launch another process which can do the asynchronous cleanups. You won't have to maintain cron entries this way.

> I think this one is a great idea. I don't see the need for cleaning
> out the database maybe more than once every hour or so, and being able
> to move the cleanup process outside of the main sqlgrey process will
> reduce the latency hit of having the daemon itself do cleanup. Not a
> big deal in your typical setup, but it will make a big difference in
> high-volume setups.

Sure. I wonder how SQLite will react to this, though... I'm not sure it won't lock the whole table, or worse the database, when doing the DELETE. That will be 1.6 material; I need to think about this a little more and test things.

In the meantime 1.4.1 will be released with a configurable "cleanup_delay" (which better describes what the current maint_delay internal variable does) with a default set to 30 minutes.

Best regards,

Lionel.
From: David R. <dr...@gr...> - 2004-12-16 15:56:52
Rene Joergensen wrote, On 12/16/2004 3:25 AM:
> - Cleaning from cron, calling sqlgrey with --db-clean (or a separate
> perlscript (but that would mean more code to maintain)), this is in my
> eyes a pretty flexible way, and non-blocking without rewriting SQLgrey
> to use Net::Server::Prefork, and then just tell SQLgrey that cleaning
> is done externally, and it doesn't have to worry about it. Default
> should be one of the other ways, to avoid people who don't RTFM
> complaining about large databases.

I think this one is a great idea. I don't see the need for cleaning out the database maybe more than once every hour or so, and being able to move the cleanup process outside of the main sqlgrey process will reduce the latency hit of having the daemon itself do cleanup. Not a big deal in your typical setup, but it will make a big difference in high-volume setups.

-Dave
From: Lionel B. <lio...@bo...> - 2004-12-16 13:45:50
Rene Joergensen wrote the following on 12/16/04 12:25:

> Hi there.
>
> I'm working at an ISP, currently using Postgrey, but looking for an
> alternative due to the limitations in Postgrey: cleaning the db takes
> 5-10 minutes each night, during which Postfix is rejecting legitimate
> mail with 450 errors.
>
> I've seen you expressing worries about the current size of SQLgrey; a
> place to start would be putting DB creation/alteration into a separate
> perl-script, it's only done once, and running the separate perlscript
> on installation/upgrade doesn't take much effort. Not much performance
> gained, but it sure would be prettier :)

As SQLgrey is required to verify the database version at startup (and if you look at the code closely, you'll see that I paid in messy code for the mistake of not storing the layout version from the start) I'm inclined to let SQLgrey handle the upgrade too. What I was planning is to move out old update code. When SQLgrey 1.6 is out, the database layout will probably be at version 2 instead of the version 1 of today's 1.4. I will then drop support for version 0 to version 1 conversion, only supporting v1 to v2 conversion. In the less frequent case of a layout at version 0, SQLgrey will "die" with a message advising to use an external script to upgrade from version 0 to 1.

> We're currently receiving around 1.2 million mails a day, which Postgrey
> handles fine. But I like the idea of the auto-whitelist in SQLgrey,
> and the possibility of getting rid of the 5-10 minutes downtime a day.
> The ongoing cleaning that SQLgrey does (3 deletes per mail iirc) would
> mean 3+ million extra queries a day, which is quite a lot.
>
> Cleaning could be done in 3 (or more) ways:
>
> - Timed interval, eg. once every 2 hours.

Supported in the code. Deactivated by mistake.

> - Interval based on queries, eg. once every 2000 queries

Maybe. I'll have to think about that one.

> - Cleaning from cron, calling sqlgrey with --db-clean (or a separate
> perlscript (but that would mean more code to maintain)), this is in my
> eyes a pretty flexible way, and non-blocking without rewriting SQLgrey
> to use Net::Server::Prefork, and then just tell SQLgrey that cleaning
> is done externally, and it doesn't have to worry about it. Default
> should be one of the other ways, to avoid people who don't RTFM
> complaining about large databases.

Currently I'm inclined to make the interval configurable in 1.4 with a more sensible default than "0" and to store a last_cleanup timestamp in the database in 1.6 to easily support multiple instances accessing the same database.

> Limiting the number of queries is always good; I haven't really looked
> much into the queries, and I'm sure you're doing what you can to limit
> them.

Yep, the order of the checks is designed to minimize the number of queries done to the database.

> The missing indexes could also become a problem as the connect-table
> will probably contain 500,000-1,000,000 rows, which are the current
> values for Postgrey, and the from_awl contains around 700,000 records.
> These values will probably be smaller when using auto-whitelisting, but
> the tables will still need indexes to avoid full table scans.

If you could test how much time each index on the connect table saves, that would help me decide which ones to create. See the FAQ file for the questions I'm waiting for answers to in order to decide on the indices. The db_performance_reports file presents the only report I have (not installed; you'll have to fetch it from the tar.bz2 archive or the CVS tree).

> I've been close to testing it in production a couple of times, but
> chickened out each time as I'm in the process of moving and therefore am
> unable to keep an eye on it due to lack of IP in the new apartment.

Wise decision. If I had reports from users with your level of traffic I could have said whether the default SQLgrey install would (or would not) have been suited to you, but you may be the first one trying with this kind of traffic, so it's better to be able to monitor the system for at least the first day.

I think I'll end up writing some perl scripts that simulate mail traffic by calling the policy daemon with adjustable traffic patterns (% of non-reconnecting zombies, distribution of the reconnect delays of real MTAs, distribution of the number of source IPs for each domain, distribution of the number of e-mail addresses per domain, for example) and rates. This should help people test SQLgrey *before* starting to greylist with it. It will help me fine-tune indices on different databases too.

Another entry in my TODO list... Probably for the 1.4 releases.

Best regards,

Lionel.
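The traffic simulator Lionel describes could start as small as the sketch below. Every ratio and distribution here is an invented placeholder (uniform retry delays, an 80% zombie share), not a measured pattern; a real tool would feed these events to the policy daemon's socket:

```python
import random

def synth_traffic(n, zombie_ratio=0.8, seed=4):
    # Generate synthetic policy requests: a fraction of clients are
    # "zombies" that never retry; the rest are real MTAs that reconnect
    # after a randomized delay (uniform here purely for illustration).
    rng = random.Random(seed)
    events = []
    for i in range(n):
        triplet = (f"sender{i}@example.com",
                   f"rcpt{i}@example.org",
                   f"10.{rng.randrange(256)}.{rng.randrange(256)}"
                   f".{rng.randrange(256)}")
        events.append((0.0, triplet))          # first delivery attempt
        if rng.random() >= zombie_ratio:       # real MTA: one retry
            events.append((rng.uniform(300, 7200), triplet))
    events.sort(key=lambda e: e[0])            # replay in time order
    return events

events = synth_traffic(1000)
retries = sum(1 for t, _ in events if t > 0)
print(len(events), retries)
```

Replaying such a trace against a test database before going live would expose missing indexes the same way Rene's production load did.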
From: Rene J. <rg...@ba...> - 2004-12-16 11:26:04
Hi there.

I'm working at an ISP, currently using Postgrey, but looking for an alternative due to the limitations in Postgrey: cleaning the db takes 5-10 minutes each night, during which Postfix is rejecting legitimate mail with 450 errors.

I've seen you expressing worries about the current size of SQLgrey; a place to start would be putting DB creation/alteration into a separate perl-script. It's only done once, and running the separate perlscript on installation/upgrade doesn't take much effort. Not much performance gained, but it sure would be prettier :)

We're currently receiving around 1.2 million mails a day, which Postgrey handles fine. But I like the idea of the auto-whitelist in SQLgrey, and the possibility of getting rid of the 5-10 minutes downtime a day. The ongoing cleaning that SQLgrey does (3 deletes per mail iirc) would mean 3+ million extra queries a day, which is quite a lot.

Cleaning could be done in 3 (or more) ways:

- Timed interval, eg. once every 2 hours.
- Interval based on queries, eg. once every 2000 queries.
- Cleaning from cron, calling sqlgrey with --db-clean (or a separate perlscript, but that would mean more code to maintain); this is in my eyes a pretty flexible way, and non-blocking without rewriting SQLgrey to use Net::Server::Prefork. Then just tell SQLgrey that cleaning is done externally, and it doesn't have to worry about it. The default should be one of the other ways, to avoid people who don't RTFM complaining about large databases.

Limiting the number of queries is always good; I haven't really looked much into the queries, and I'm sure you're doing what you can to limit them.

The missing indexes could also become a problem as the connect-table will probably contain 500,000-1,000,000 rows, which are the current values for Postgrey, and the from_awl contains around 700,000 records. These values will probably be smaller when using auto-whitelisting, but the tables will still need indexes to avoid full table scans.

I've been close to testing it in production a couple of times, but chickened out each time as I'm in the process of moving and therefore am unable to keep an eye on it due to lack of IP in the new apartment.

--
-René
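Rene's second option, cleaning once every N queries, amounts to a counter inside the policy loop. This sketch uses a hypothetical threshold and callback rather than SQLgrey's actual code:

```python
class QueryCountCleaner:
    """Run a cleanup callback once every `threshold` policy queries."""

    def __init__(self, threshold, cleanup):
        self.threshold = threshold
        self.cleanup = cleanup
        self.count = 0

    def on_query(self):
        # Called once per greylisting decision; fires the cleanup
        # (e.g. DELETE of expired connect/awl rows) every N queries.
        self.count += 1
        if self.count >= self.threshold:
            self.count = 0
            self.cleanup()

runs = []
cleaner = QueryCountCleaner(2000, lambda: runs.append(1))
for _ in range(4100):          # simulate 4100 greylisting queries
    cleaner.on_query()
print(len(runs))  # 2: cleanup fired at queries 2000 and 4000
```

Compared with a timed interval, this self-scales with load: a busy server cleans more often, an idle one never wastes a DELETE.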
From: Farkas L. <lf...@bp...> - 2004-12-15 13:42:48
reply to my own mail :-(

Farkas Levente wrote:
> imho the greylist database is not so complicated. it's easy to recognize
> which records have to be replicated. only old/expired records have to be
> deleted, the last updated one is always the latest, and all records have
> timestamps (because that's the main purpose of the database), so it's
> easy to know which is the last updated.

just another tip, maybe there are better ideas: one solution is to create a connection to all (both, if there are only two) sql servers. when you look up a triplet you can do it on all sql servers and use the latest, i.e. the one with the smallest value. update only one sql server (probably the first; in this case everybody updates the same server in most cases). in the cleanup step (or at another scheduled time, e.g. hourly) you can merge/replicate the sql servers before the delete (update to the latest triplet value on all sql servers). so you don't need mysql replication; you can do the whole thing in the sqlgrey server.

--
Levente "Si vis pacem para bellum!"
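Farkas's "ask every server, keep the earliest value" lookup can be sketched against two deliberately out-of-sync databases; the schema and timestamps are assumptions for illustration, and SQLite stands in for the two MySQL servers:

```python
import sqlite3

def lookup_first_seen(servers, triplet):
    # Query every reachable SQL server for the triplet and keep the
    # smallest (earliest) first_seen, as suggested above.
    name, domain, ip = triplet
    seen = [row[0] for db in servers
            for row in db.execute(
                "SELECT first_seen FROM connect WHERE sender_name=? "
                "AND sender_domain=? AND ip_addr=?", (name, domain, ip))]
    return min(seen) if seen else None

servers = [sqlite3.connect(":memory:") for _ in range(2)]
for db in servers:
    db.execute("CREATE TABLE connect (sender_name TEXT, "
               "sender_domain TEXT, ip_addr TEXT, first_seen TEXT)")

# The two servers drifted apart: each holds a different timestamp.
triplet = ("rgj", "bananas.dk", "194.255.237")
servers[0].execute("INSERT INTO connect VALUES (?,?,?,?)",
                   triplet + ("2004-12-16 10:00:00",))
servers[1].execute("INSERT INTO connect VALUES (?,?,?,?)",
                   triplet + ("2004-12-16 09:45:00",))

earliest = lookup_first_seen(servers, triplet)
print(earliest)  # 2004-12-16 09:45:00
```

Note this only covers reads; Lionel's objection in the following messages is about the write side, where two instances disagreeing on which server is writable corrupts the data.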
From: Lionel B. <lio...@bo...> - 2004-12-15 13:33:29
Farkas Levente wrote the following on 12/15/04 13:04:

> - my first assumption: if one of the mx can't access one sql server,
> then none can. otherwise it's a really strange thing; in that case we
> can accept no greylisting, just dunno.

This is the root of the problem: this assumption is incorrect. There are several cases where it can happen:
- temporary network link failure (cable unplugged, hardware failing then resynchronising): a temporary split of your network,
- SQLgrey automatically reconnects after an error, so if you take the RW database down for a short time, some SQLgrey instances will have to access the database during this short time and some not. The former won't be able to reconnect to the database they were using, so they will look for another. The latter *will* be able to reconnect to the same database.

> - try to use replication between sql servers.

You have to be more precise on this; there are very different implementations of replication between databases, from the simple dump-to-file/reload to the Oracle cluster. Each one comes with its advantages and limitations, and the one you use will change what the applications using the database pool can and cannot do with it.

> - allow writes to the slave too, and when the master wakes up then
> replicate the data back.

This won't work: your slave could be used at any moment by a SQLgrey which for whatever reason couldn't contact your master: you'll corrupt your data.

> - in my case actually there is no master and slave, just two sql servers
> with the same database (or almost the same, and there are certain points
> when they are syncing), and there is always one which is rw by all
> greylist servers (the first).
>
> imho the greylist database is not so complicated. it's easy to recognize
> which records have to be replicated. only old/expired records have to be
> deleted, the last updated one is always the latest, and all records have
> timestamps (because that's the main purpose of the database), so it's
> easy to know which is the last updated.
>
>> Here are simple questions to make sure we speak of the same things.
>> Do you agree with the following statements?
>> - one and only one sql server should accept writes from every SQLgrey
>> instance. Let's call it the RW server (read-write).
>
> no. both, but all greylist servers rw one of them at the same time.

Won't work, as explained above. You can't be sure one SQLgrey instance won't fail to contact the database you chose as master while others succeed. There's no point discussing the rest until you understand this.

Reliable database failover is *hard*; please take the time to understand these hard facts:
- there's no affordable database system on the market *yet* that allows multiple replicated read/write databases (only commercial databases in the hundreds-of-thousands-of-euros/dollars range allow this, and even they have limitations); you can only bet on master/slave schemes,
- when using master/slave schemes you *can't* write directly to the slaves; you must use one and only one database in read/write mode for *every* SQL client accessing the database pool,
- you cannot prevent the case where one instance among a pool of SQL clients won't be able to contact the "master" server, and only that one.

Seriously, what do you find wrong with a take-over IP solution? Reminder: slave replication is in place, the master fails, admin scripts detect the failure, take the IP down on the master's interface and set up the same IP on the slave, switching it to read-write mode if needed (this depends on how the replication works; it might or might not need the slave database put in read-only mode). This is easily workable as it ensures you can't access 2 databases at the same time, and SQLgrey will make the take-over IP process transparent as it will automatically reconnect to the server replacing the failing one.

Best regards,

Lionel.
From: Lionel B. <lio...@bo...> - 2004-12-15 12:44:39
Josh Endries wrote the following on 12/15/04 09:29:
> [...]
> | Conclusion : no greylisting before alias expansion

Oops, typo. I meant *after*, of course.

> I'm confused. First you said it shouldn't be possible to greylist
> after expansion, then you said no greylisting before expansion. I'm
> guessing Postfix will do the alias resolution before policy, as
> different "real" users may have different policies, but that's just
> a hunch. I can test this to find out what happens.
From: Farkas L. <lf...@bp...> - 2004-12-15 12:04:44
|
Lionel Bouton wrote:
> In the case of one sqlgrey instance on each mx, even if you don't have
> *any* slave, as I explained you won't have any single point of failure
> (you won't have any greylisting while the database is down, though).

you're right; i'd like to avoid both a failure and postfix running without greylisting:-)

>> [...] - that's why there are slave ldap servers,
>
> The devil is in the details. If I understand correctly, you want SQLgrey
> instances to know a slave or list of slaves and fall back automatically
> to the slave or one of the slaves if the master doesn't answer.
> What I don't know is how you will solve the following problems related
> to the writes done to the database by SQLgrey :
> - one SQLgrey can't access the master due to a temporary link problem,
> switches to the slave and tries to update its database content although
> the slave won't authorize it (you can't allow writes to a slave or
> you'll end up with consistency problems or PRIMARY KEY collisions as I
> explained in a previous message). Should it scan the list of servers
> until it finds the one accepting writes ?
> - in case of multiple slaves, how do you make every SQLgrey instance
> aware of the one among them becoming the new master ?

first of all, i'd be happy with one master and one slave. second, i don't know the right solution, only what i would like to see; i've talked about it with the sysadmins of a few other, bigger sites and i'm collecting the problems. anyway:
- my first assumption is that if one of the mx can't access an sql server, then none of them can. otherwise it's a really strange situation, in which case we may have to accept no greylisting; i just don't know.
- try to use replication between the sql servers.
- allow writes to the slave too, and when the master wakes up, replicate the data back to it.
- in my case there is actually no master and slave; there are just two sql servers with the same database (or almost the same: there are certain points where they sync), and there is always one (the first) which is read-write for all greylist servers.

imho the greylist database is not that complicated. it's easy to recognize which records have to be replicated: only old/expired records have to be deleted, the last updated record is always the latest, and every record has a timestamp (since that's the main purpose of the database), so it's easy to know which one was updated last.

> Here are simple questions to make sure we speak of the same things. Do
> you agree with the following statements ?
> - one and only one sql server should accept writes from every SQLgrey
> instance. Let's call it the RW server (read-write).

no. both accept writes, but all greylist servers write to only one of them at any given time.

> - all other sql servers replicate the content of the server above in a
> timely manner (using the MySQL replication process for example). Let's call
> them the RO servers (read-only).

partly. it can be done with mysql replication or, as i wrote above, since it's easy to recognize from the database what has to be replicated, it may be doable by the greylist server itself in a scheduled way (every minute or every 5 minutes, update the sql server which is not the current rw server). e.g. update the current database plus another table called "non-replicated", flush this data to the other server every 1, 5 or 10 minutes, and do a full replication every hour. i don't know which is better and/or easier.

> - when the RW server fails,
> . either one of the slaves and *only* one switches to the RW status,
> making sure all other RO servers are now synchronising with it (this is
> not a simple process; I don't know how MySQL replication handles the
> fact that RO servers will not necessarily have the same content when the
> RW fails),
> . either none of them switches to RW and SQLgrey now only has RO
> databases, so it cannot store new connection attempts in the database : it
> can't greylist anymore so it *must* let every message pass (-> slaves
> won't be used at all).

see above.

> - when a failed server comes back online it must do so in a RO state if
> there is already a RW server; if not, it can become the RW server but
> should have the most recent data available in its database when doing so
> (or your auto-whitelisting efficiency will suffer and you may treat
> reconnections as brand new connections).

partly. i assume there is always one sql server which accepts rw; we can call this the master. and even if the previous master comes back, it just becomes an sql server which periodically updates its database (through mysql replication or the greylist service). here i always talk about sql servers, not greylist servers. suppose there is one greylist server on each mx and two sql servers somewhere: one sql server is up-to-date and one is the replica. as i wrote, i assume that when an sql server is gone, it's gone for everyone.

> If you don't agree with one of them, please explain why.

i hope this helps.

-- 
Levente                               "Si vis pacem para bellum!"
 |
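Levente's "non-replicated" journal idea above can be sketched roughly as follows. This is a hypothetical illustration only: the table and column names are invented, and two in-memory SQLite databases stand in for the pair of MySQL servers; it is not SQLgrey code.

```python
import sqlite3

# Two in-memory databases stand in for the primary and secondary SQL servers.
primary = sqlite3.connect(":memory:")
secondary = sqlite3.connect(":memory:")

SCHEMA = """CREATE TABLE connect (sender TEXT, ip TEXT, last_seen INTEGER,
                                  PRIMARY KEY (sender, ip))"""
primary.execute(SCHEMA)
secondary.execute(SCHEMA)
# Journal of rows changed since the last flush (the "non-replicated" table).
primary.execute("CREATE TABLE non_replicated (sender TEXT, ip TEXT)")

def record_attempt(sender, ip, now):
    """Write to the RW server and journal the change for later replication."""
    primary.execute("INSERT OR REPLACE INTO connect VALUES (?, ?, ?)",
                    (sender, ip, now))
    primary.execute("INSERT INTO non_replicated VALUES (?, ?)", (sender, ip))
    primary.commit()

def flush_journal():
    """Scheduled job: push journalled rows to the other server, clear journal."""
    rows = primary.execute(
        "SELECT DISTINCT c.sender, c.ip, c.last_seen FROM connect c "
        "JOIN non_replicated n ON c.sender = n.sender AND c.ip = n.ip"
    ).fetchall()
    # The latest timestamp always wins, so a plain upsert is enough.
    secondary.executemany("INSERT OR REPLACE INTO connect VALUES (?, ?, ?)",
                          rows)
    secondary.commit()
    primary.execute("DELETE FROM non_replicated")
    primary.commit()

record_attempt("alice@example.com", "192.0.2.1", 1103100000)
flush_journal()
```

The point of the journal is that, as Levente notes, every record carries a timestamp, so the flush never has to diff the full tables: it only ships what changed since the last run.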
From: Lionel B. <lio...@bo...> - 2004-12-15 11:20:39
|
Farkas Levente wrote the following on 12/15/04 10:35 :

> no. first of all, my main question: why is it worth switching to sqlgrey
> (or any other greylist server) from postgrey?
> under normal circumstances all mx hosts have to use the same greylist
> database, otherwise the basic idea fails (the delay can be too long).
> that does not mean that every mx should use the same greylist server,
> but they have to use greylist servers which use the same database.
> currently we use one postgrey server and all other mx connect to this
> postgrey server. but this is a single point of failure! so the only
> reason i see to switch to another greylist server is to avoid this
> single point of failure! but there is one more thing: the failure is
> usually not in the greylist server (postgrey, for instance, never stops
> if configured well); the critical part is the machine itself which runs
> the greylist server. there can be hardware problems and there can be
> network problems.
> here is what i can imagine when we use an sql server as the database:
> each mx runs its own greylist server and all greylist servers connect to
> the same sql server. but in this case the same single point of failure
> exists: the sql server's machine! so if i could configure several sql
> servers for each greylist server, and the sql servers replicated the
> database among each other, then the single point of failure would
> disappear! that would be a reason to switch.

In the case of one sqlgrey instance on each mx, even if you don't have *any* slave, as I explained you won't have any single point of failure (you won't have any greylisting while the database is down, though).

>> [...]
>
> imho replication is not sqlgrey's responsibility!

I can't agree more :-)

> [...] - that's why there are slave ldap servers,

The devil is in the details. If I understand correctly, you want SQLgrey instances to know a slave or list of slaves and fall back automatically to the slave or one of the slaves if the master doesn't answer.

What I don't know is how you will solve the following problems related to the writes done to the database by SQLgrey :
- one SQLgrey can't access the master due to a temporary link problem, switches to the slave and tries to update its database content although the slave won't authorize it (you can't allow writes to a slave or you'll end up with consistency problems or PRIMARY KEY collisions as I explained in a previous message). Should it scan the list of servers until it finds the one accepting writes ?
- in case of multiple slaves, how do you make every SQLgrey instance aware of the one among them becoming the new master ?

If you want to try each server in a list, looking for the one that accepts writes, assuming that one and only one server from the database pool can accept them, pay attention to the fact that managing your sql server pool will be quite error prone. You only have to forget to forbid writes on a database coming back online after a failure and you end up with the whole platform rejecting e-mails in a more or less random fashion.

Here are simple questions to make sure we speak of the same things. Do you agree with the following statements ?
- one and only one sql server should accept writes from every SQLgrey instance. Let's call it the RW server (read-write).
- all other sql servers replicate the content of the server above in a timely manner (using the MySQL replication process for example). Let's call them the RO servers (read-only).
- when the RW server fails,
  . either one of the slaves and *only* one switches to the RW status, making sure all other RO servers are now synchronising with it (this is not a simple process; I don't know how MySQL replication handles the fact that RO servers will not necessarily have the same content when the RW fails),
  . or none of them switches to RW and SQLgrey now only has RO databases, so it cannot store new connection attempts in the database : it can't greylist anymore, so it *must* let every message pass (-> slaves won't be used at all).
- when a failed server comes back online it must do so in a RO state if there is already a RW server; if not, it can become the RW server, but it should have the most recent data available in its database when doing so (or your auto-whitelisting efficiency will suffer and you may treat reconnections as brand new connections).

If you don't agree with one of them, please explain why. If you agree with all of them, is there any database out there that allows you to set up a system which handles all these requirements ?

> i hope you understand my main problem/requirements/wish about a
> greylist server.

I think I'm beginning to understand what you want, but I don't know yet how you would solve the problems described above with MySQL, for example. I don't think adding failover to SQLgrey will do much good if there's no (easy) way to configure the database servers to respect the requirements of a failover environment. Please remember that you can already set up a failover environment with SQLgrey if you set up an IP takeover process triggered when you detect that the RW database is down.

> if still not than it can be because my bad english:-(

English is not my mother tongue either, don't worry :-)

Lionel.
 |
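The single-RW constraint Lionel insists on can be sketched as client-side selection logic. This is a hypothetical illustration with stand-in server objects, not actual SQLgrey or MySQL code: the client only ever writes to the one server flagged read-write, refuses a misconfigured pool with two writers, and fails open (no greylisting) when no writer is reachable.

```python
class SqlServer:
    """Stand-in for one database server in the pool."""
    def __init__(self, name, read_only=True, up=True):
        self.name, self.read_only, self.up = name, read_only, up

def pick_writer(pool):
    """Return the single reachable RW server, or None (meaning: greylisting
    must be suspended and every message let through)."""
    writers = [s for s in pool if s.up and not s.read_only]
    if len(writers) > 1:
        # Two RW servers means someone forgot to set a recovered server
        # back to RO: refuse to write rather than risk PRIMARY KEY
        # collisions between diverging masters.
        raise RuntimeError("more than one RW server in the pool")
    return writers[0] if writers else None

pool = [SqlServer("master", read_only=False), SqlServer("slave")]
assert pick_writer(pool).name == "master"

pool[0].up = False                  # the master goes down...
assert pick_writer(pool) is None    # ...so no greylisting: fail open

pool[1].read_only = False           # an external process promotes the slave
assert pick_writer(pool).name == "slave"
```

Note that the hard parts of the thread (promoting exactly one slave, re-pointing replication, demoting a returning master) happen outside this function; the sketch only shows why the client side stays simple if the pool guarantees a single writer.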
From: Lionel B. <lio...@bo...> - 2004-12-15 10:15:39
|
Farkas Levente wrote the following on 12/15/04 10:39 :

> imho it'd be better to leave spf to another policy daemon (many good
> ones already exist). a greylist server should be only a greylist
> server, no more, no less! i can only repeat two phrases:
> - simplicity, generality, clarity!
> - a program is not ready when there is nothing more to add; it's
> ready when there is nothing left to remove!

I agree on the principle. But here's the idea : if a domain uses SPF in a way that makes a connection authorized or forbidden, it can help the decision process :
- connection forbidden : don't try to greylist. This can be done at the Postfix level by chaining policy daemons,
- connection authorized : you have 2 options, trust the domain admins and don't greylist, or add your own verification layer by greylisting. I wonder if that's easy to configure in Postfix, or even doable.
- SPF can't help us (no record applying to the connection) : we want to greylist.

I'm not yet fluent enough in Postfix configuration to write a HOWTO detailing how to configure it properly when using a separate SPF policy daemon. This is why the word "experiment" is used and this is left to do in development versions...

> just my 2c:-)

Don't worry : if it's simple to separate SPF and greylisting by configuring Postfix properly, I'll probably develop a pure SPF policy daemon, or reuse one that fits our needs, and write a howto for combining them. The goal is to have *optional* SPF support.

Best regards,

Lionel.
 |
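The "chaining policy daemons" case (SPF rejects outright, so greylisting is never consulted for forbidden senders) could look like this in Postfix `main.cf`. This is a sketch only: the listening ports are invented examples (any SPF policy daemon on one, the greylister on the other), not something prescribed in the thread.

```
# Sketch: ask a hypothetical SPF policy daemon first; if it rejects the
# connection, the greylisting policy service after it is never reached.
# Ports are example values for this illustration.
smtpd_recipient_restrictions =
    permit_mynetworks,
    reject_unauth_destination,
    check_policy_service inet:127.0.0.1:10022,
    check_policy_service inet:127.0.0.1:2501
```

Restrictions are evaluated left to right, so ordering the SPF service before the greylisting service implements exactly the "connection forbidden : don't try to greylist" branch; the "authorized : skip greylisting" branch is the part Lionel is unsure is expressible this way.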
From: Farkas L. <lf...@bp...> - 2004-12-15 09:39:32
|
Lionel Bouton wrote:
> This is roughly organised by priority. Did I miss something ?
> ...
> 1.5.x development releases
> - use time units everywhere to make configuration easier (add safety
> barriers ?)
> - experiment with SPF support

imho it'd be better to leave spf to another policy daemon (many good ones already exist). a greylist server should be only a greylist server, no more, no less! i can only repeat two phrases:
- simplicity, generality, clarity!
- a program is not ready when there is nothing more to add; it's ready when there is nothing left to remove!

just my 2c:-)

-- 
Levente                               "Si vis pacem para bellum!"
 |
From: Farkas L. <lf...@bp...> - 2004-12-15 09:34:52
|
Lionel Bouton wrote:
> Farkas Levente wrote the following on 12/14/04 18:08 :
>
>>> [...] Anyway, with auto-whitelisting your users shouldn't notice much
>>> of a delay. Just make the switch on Friday evening and let the
>>> marginal week-end traffic populate the auto-whitelist tables.
>>
>> that's not so easy! most of these important emails come a few hours
>> before deadlines (usually Tuesday 18:00 and Thursday 18:00), which
>> makes things a bit complicated:-(
>
> Understandable; you might want to use an opt-in policy for greylisting.
> Greylisting is a tradeoff that auto-whitelists can only make less
> painful. You should make your users aware that either :
> - you use greylisting for all of them, which means that poorly configured
> mail servers won't deliver in a timely manner (and some rare ones never),
> but on the other hand their SPAM level is less than half of what it
> could be (remember that asking the sender to resend the message will
> solve the problem in most cases).
> - you use greylisting on an opt-in basis and they have to choose what
> they consider more important : less SPAM or "instant messaging". Their
> choice, their responsibility.

as always, they would like both:-) but currently that's the situation: postgrey's database already contains most of our partners' email addresses, so there is no delay in most cases, but if i start with a fresh/clean sqlgrey the delay happens with all emails:-(((

>>> - if needed (sqlgrey can cope with database unavailability),
>>> configure replication to a slave database.
>>
>> is it currently possible?
>
> It depends on the database system. Currently SQLgrey only connects to
> one database (which would be the master) though.

so currently a slave can't be configured:-(

>>> If you use a slave database, you can either : make it take over the
>>> master's IP, update the DNS to reroute sqlgrey connections to the
>>> slave (my personal choice - put low TTLs in the DNS - this way you
>>> don't have to take the master fully down if only its database system
>>> is down), or (if it can be done quickly) put the master back online.
>>
>> imho it'd be better to be able to configure several (slave) sql servers
>> for sqlgrey instead of dns manipulation.
>
> I'm not sure I understand. Do you mean that SQLgrey should directly
> access several servers and update all of them, or do you want replication
> done at the database level (SQLgrey being unaware of the process
> replicating the master to the slaves) ?

no. first of all, my main question: why is it worth switching to sqlgrey (or any other greylist server) from postgrey? under normal circumstances all mx hosts have to use the same greylist database, otherwise the basic idea fails (the delay can be too long). that does not mean that every mx should use the same greylist server, but they have to use greylist servers which use the same database. currently we use one postgrey server and all other mx connect to this postgrey server. but this is a single point of failure! so the only reason i see to switch to another greylist server is to avoid this single point of failure! but there is one more thing: the failure is usually not in the greylist server (postgrey, for instance, never stops if configured well); the critical part is the machine itself which runs the greylist server. there can be hardware problems and there can be network problems. here is what i can imagine when we use an sql server as the database: each mx runs its own greylist server and all greylist servers connect to the same sql server. but in this case the same single point of failure exists: the sql server's machine! so if i could configure several sql servers for each greylist server, and the sql servers replicated the database among each other, then the single point of failure would disappear! that would be a reason to switch.

> The former is doable but will be quite complex : SQLgrey would have to
> support adding an empty database to a farm of databases and populate the
> tables of each database without data, allowing the message to pass when
> there's at least another database with data making SQLgrey decide to
> accept it. This must be done at every step of the decision process
> (valid previous connection attempt, e-mail auto-whitelist entry, domain
> auto-whitelist entry).

imho replication is not sqlgrey's responsibility!

> If this is what you want, I'm afraid it should be another project :
> SQLgrey's current model is not the best suited for this. You'd want to
> make the same request to different databases in parallel and wait for all of
> them to complete or time out, mark databases as faulty to avoid spending
> time waiting for timeouts, ...

i hope my earlier explanation shows what i want:-)

> In the latter case you want SQLgrey to be aware of the fact that there
> is a replication process occurring between several databases and one of
> them is the master. You want to ensure only one is in RW mode, that this
> one is known by SQLgrey, and that when it goes down an external process
> decides which slave becomes the master and
> - does what's needed to reconfigure it as the master,
> - signals each SQLgrey server to use this new one.

don't go that far! it seems to me that you always assume the failure is at the sql server level. i repeat myself: i trust the sql server (it never dies), but i do not trust the machine and the network! and that's what i'd like to guard against! usually that's the main reason for slave servers:
- that's why there are multiple mx hosts,
- that's why there are slave ldap servers,
- that's why there are backup domain controllers on windows,
- it's a bit like raid-1,5,6 (one or two disks can fail at the same time, but no more).

so if there is one master, then in case of its failure (i.e. it's not reachable) the greylist server can switch to another one which has the same (or almost the same, i.e. replicated, e.g. within the last hour) database. that can be enough! the sysadmin can then notice and fix the problem (like fixing the mx, ldap server or domain controller, or replacing the failed disk in the above examples).

> For that, I only see one thing needed on SQLgrey's side : modify SQLgrey
> in order to allow on-the-fly reconnection to another database. The rest
> is database specific.
> But I don't really see the benefit of making this so complex; usually
> replicated databases come with what's needed to make a slave replace a
> master by taking over its IP address. In this case SQLgrey will work
> correctly out of the box.
>
>> imho it'd be enough to switch to the slave, and the slave could
>> replicate the master, e.g. once a day (before becoming the master)
>
> I'm not sure I understand.

i hope you understand my main problem/requirements/wish about a greylist server. if still not, it can be because of my bad english:-(

yours.

-- 
Levente                               "Si vis pacem para bellum!"
 |
From: Josh E. <jo...@en...> - 2004-12-15 08:38:13
|
Lionel Bouton wrote:
| But it doesn't work : the first pool did already accept the message when
| the second wants to greylist.

Ahh, I see now; this is what I was thinking also. I guess it will be most effective if done on the first machine.

| I just realised that in fact it shouldn't be possible to do greylisting
| after alias expansion. Let me explain :
| - Postfix handles the domain example.com
| - there's an alias "adm...@ex..." expanding to the final
| recipients "pos...@ex..." and "ro...@ex...".
| - Postfix wants messages to root to be greylisted but not messages to
| postmaster.
| - The greylister doesn't know yet that "se...@ot..." on
| 123.48.12.58 is a valid couple.
| - se...@ot... sends an e-mail from 123.48.12.58 to
| adm...@do...
| What should Postfix do ? It can't refuse the mail because postmaster
| doesn't want its incoming messages to be greylisted, but at the same time
| root doesn't want to receive messages that haven't been greylisted, so it
| can't accept it either.
|
| Conclusion : no greylisting before alias expansion

I'm confused. First you said it shouldn't be possible to greylist after expansion, then you said no greylisting before expansion. I'm guessing Postfix will do the alias resolution before policy, as different "real" users may have different policies, but that's just a hunch. I can test this to find out what happens.

Josh
 |
From: Lionel B. <lio...@bo...> - 2004-12-15 07:30:20
|
This is roughly organised by priority. Did I miss something ?

-- BEGIN

1.4.x
- fix EGID bug (done in CVS),
- allow distributing whitelists from a central point
  . reload whitelists on SIGUSR1 (done in CVS, not tested)
  . separate script which updates the whitelist and calls SIGUSR1
- send mails (adjustable rate) to postmaster when the database goes down and comes back
- make maint_delay configurable

1.5.x development releases
- use time units everywhere to make configuration easier (add safety barriers ?)
- experiment with SPF support
- consider opt-in/opt-out (is it easy to do with Postfix ? Then provide a HOWTO, else code it in SQLgrey and document it)
- support migrating from another greylister by calling it to learn its auto-whitelists
- external cleanup process / experiment with an adaptive cleanup algorithm

As soon as available...
- gather enough perf data to decide on indices to add

-- END

I thought that, starting with 1.4.0, development would slow down :-)

People asking for whitelist reloads probably noticed that I don't use SIGHUP for whitelist reloading as initially planned. This is because when I implemented the reloading code in SQLgrey I discovered that SIGHUP handling was done by Net::Server. I tried to use the appropriate hooks to reload the whitelists, but it seems Net::Server::Multiplex's SIGHUP handling is buggy (there seems to be a race condition related to the pidfile removal when it restarts the process that leads to a complete crash), so I went the USR1 way...

When the example script and the distribution server are both ready I'll release 1.4.1.

Lionel.
 |
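The SIGUSR1-triggered whitelist reload from the TODO can be sketched generically like this. This is a Python illustration of the pattern only (SQLgrey itself is Perl on Net::Server, and the file name and format here are invented): a handler re-parses the whitelist file, and the separate updater script rewrites the file and then signals the daemon.

```python
import os
import signal
import tempfile

# Hypothetical whitelist location and format (one entry per line, '#' comments).
WHITELIST_PATH = os.path.join(tempfile.gettempdir(), "clients_ip_whitelist")
whitelist = set()

def load_whitelist():
    """(Re)read the whitelist file, skipping blank lines and comments."""
    entries = set()
    try:
        with open(WHITELIST_PATH) as fh:
            for line in fh:
                line = line.strip()
                if line and not line.startswith("#"):
                    entries.add(line)
    except FileNotFoundError:
        pass  # keep running with an empty whitelist rather than crash
    return entries

def on_sigusr1(signum, frame):
    """Signal handler: swap in the freshly parsed whitelist atomically."""
    global whitelist
    whitelist = load_whitelist()

signal.signal(signal.SIGUSR1, on_sigusr1)

# Simulate the separate updater script: rewrite the file, then notify us.
with open(WHITELIST_PATH, "w") as fh:
    fh.write("# centrally distributed whitelist\n192.0.2.1\n")
os.kill(os.getpid(), signal.SIGUSR1)
print("192.0.2.1" in whitelist)  # → True
```

Rebuilding the set and swapping the reference (instead of mutating the live set) means in-flight lookups never see a half-reloaded whitelist, which is the same reason a daemon reloads on a signal rather than re-reading the file per query.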
From: Lionel B. <lio...@bo...> - 2004-12-15 07:08:23
|
Josh Endries wrote the following on 12/15/04 07:40 :

> Lionel Bouton wrote:
> | I didn't realise you could make Postfix use the greylisting policy
> | daemon after alias expansion. How do you do that ?
>
> Well, I lied (kinda). I do it via multiple instances (actually,
> multiple physical servers).

I'm not sure I see how you do it. Here's what I imagine (probably because this was the process I thought of when trying to) :
- a first pool accepts the messages, processes alias expansion and forwards the messages to a second pool,
- the second pool greylists.

But it doesn't work : the first pool did already accept the message when the second wants to greylist.

I just realised that in fact it shouldn't be possible to do greylisting after alias expansion. Let me explain :
- Postfix handles the domain example.com
- there's an alias "adm...@ex..." expanding to the final recipients "pos...@ex..." and "ro...@ex...".
- Postfix wants messages to root to be greylisted but not messages to postmaster.
- The greylister doesn't know yet that "se...@ot..." on 123.48.12.58 is a valid couple.
- se...@ot... sends an e-mail from 123.48.12.58 to adm...@do...

What should Postfix do ? It can't refuse the mail because postmaster doesn't want its incoming messages to be greylisted, but at the same time root doesn't want to receive messages that haven't been greylisted, so it can't accept it either.

Conclusion : no greylisting before alias expansion

Lionel.
 |
From: Josh E. <jo...@en...> - 2004-12-15 06:49:13
|
Lionel Bouton wrote:
| I didn't realise you could make Postfix use the greylisting policy
| daemon after alias expansion. How do you do that ?

Well, I lied (kinda). I do it via multiple instances (actually, multiple physical servers).

| If it can be done there, that's good ; please explain to the list how you
| do it and I'll make it a HOWTO. I saw at least one other greylisting
| implementation providing opt-in/opt-out, so I'm wondering why they had
| to do it.

I'm pretty confident I'll get something worked out, as this would be a great thing to offer. I need to finish my Horde module first, though. I'll be sure to share my findings. :)

Thanks,
Josh
 |
From: HaJo S. <ha...@ha...> - 2004-12-15 05:34:36
|
On Wed, December 15, 2004 1:38, Lionel Bouton said:
> I can add opt-in and opt-out support if needed.

I think this is an excellent idea! One more for your little TODO ;-)

-- 
HaJo Schatz <ha...@ha...>
http://www.HaJo.Net

PGP-Key: http://www.hajo.net/hajonet/keys/pgpkey_hajo.txt
 |