From: Rene J. <rg...@ba...> - 2005-01-12 12:36:41
Attachments:
sqlgrey-rgj.patch
Hi there.

SQLgrey is still running fine, upgraded to 1.4.1 a couple of days ago.

I'm seeing the same Sys::Syslog errors as Michael (Debian testing).

I've written a small patch for SQLgrey so it'll log how long the cleaning took and how much was deleted. It won't iterate over the deleted lines in connect if the loglevel is too low, and I've also adjusted the loglevel for some errors which I would like to see even when running in quiet mode.

I've attached the patch, use it if you can.

--
-René
From: Lionel B. <lio...@bo...> - 2005-01-12 23:06:39
Rene Joergensen wrote the following on 01/12/05 13:17 :

> Hi there.
>
> SQLgrey is still running fine, upgraded to 1.4.1 a couple of days ago.
>
> I'm seeing the same Sys::Syslog errors as Michael (Debian testing).

Solved in my tree. The Sys::Syslog doc isn't clear and it seems the behaviour changes between Net::Server or Sys::Syslog versions (I don't really know what happens to the Net::Server logopts content, I'll have to look into Net::Server and maybe Net::Server::Multiplex)... I have one FC3 system where it didn't log the pid but showed the error messages, and a Gentoo where it logs the pids (?!) and doesn't show any error (?!?!). Now they both don't show any errors, but I have yet to understand the pid logging on the Gentoo...

> I've written a small patch for SQLgrey so it'll log how long the cleaning took and how much was deleted.

Cool. Never thought of that but it's warmly welcomed.

> It won't iterate over the deleted lines in connect if the loglevel is too low

Even cooler!

> and I've also adjusted the loglevel for some errors which I would like to see even when running in quiet mode.

Ok, the only things I changed are the log level used for the time spent cleaning (in quiet mode I don't want a log line every 30 minutes) and the log syntax (it's a matter of taste).

Maybe a log level won't suit us anymore as we don't put the same loglevel on the same information. We may have to switch to a log=item1,item2,... format, which in practice could look like:

- log=dbcleanup,awl_populate,delay,spam,awl_connections for someone who wants everything (the word 'everything' could be a shortcut),
- log=dbcleanup for someone who wants to monitor the DB perf (the cleanup should be the most intensive DB op).

Errors and db conversion would implicitly be logged whatever we set log to. 1.5.x material.

> I've attached the patch, use it if you can.

It is now in my tree and your name is in the Changelog. I'll have to add a THANKS file for you and most of the subscribers here. Last time I looked there were fewer than 20 people subscribed and a large proportion helped with sound enhancement requests, performance reports, bug reports and so on, thanks guys.

I've yet to test it and then issue a 1.4.2 release.

Thanks,
Lionel.
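The proposed log=item1,item2 selection could be sketched roughly as follows (a hypothetical illustration in shell for brevity; the category names come from the mail above, but want_log and its logic are not SQLgrey code):

```shell
#!/bin/sh
# Hypothetical sketch of the proposed log=item1,item2 selection;
# not actual SQLgrey code.
LOG_ITEMS="dbcleanup,awl_populate"

# want_log CATEGORY: succeed if CATEGORY was selected.
# 'everything' is the shortcut enabling all categories; errors and
# db conversion would bypass this check and always be logged.
want_log() {
    case ",$LOG_ITEMS," in
        *,everything,*|*,"$1",*) return 0 ;;
        *) return 1 ;;
    esac
}

want_log dbcleanup && echo "dbcleanup: logged"
want_log spam || echo "spam: suppressed"
```

The comma-wrapping trick makes the membership test exact, so a category like "awl" never matches "awl_populate" by accident.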
From: Rene J. <rg...@ba...> - 2005-01-13 09:48:29
On Thu, Jan 13, 2005 at 12:05:39AM +0100, Lionel Bouton wrote:
> > I've written a small patch for SQLgrey so it'll log how long the cleaning took and how much was deleted.
> Cool. Never thought of that but it's warmly welcomed.

It's nice to know how long SQLgrey didn't answer requests from Postfix, and also how many rows were deleted, without writing a lot of lines to syslog (20-30k lines per hour in our setup).

> > It won't iterate over the deleted lines in connect if the loglevel is too low
> Even cooler!

Seemed like a waste of time, it could probably take a second or two more. I actually looked for something like $self->{server}{log_level}, but couldn't find it :-)

> Ok, the only things I changed are the log level used for the time spent cleaning (in quiet mode I don't want a log line every 30 minutes) and the log syntax (it's a matter of taste).

The log syntax was inspired by Diablo (NNTP software) because I couldn't come up with something myself :)

> Maybe a log level won't suit us anymore as we don't put the same loglevel on the same information. [...]

Sounds like a cool idea. With the large volume of mail in our system we (me, myself and I) don't want the "Probable spam" logging, but I would like to see the lines from the DB cleaning (right now I've just changed the loglevel on that one line in 1.4.2), so tuneable logging would be a nice feature. Maybe you should start the 1.5.x branch :o)

> It is now in my tree and your name is in the Changelog. I'll have to add a THANKS file for you and most of the subscribers here. Last time I looked there were fewer than 20 people subscribed and a large proportion helped with sound enhancement requests, performance reports, bug reports and so on, thanks guys.

It's always nice to help enhance a good piece of software/idea.

Speaking of performance reports, do you need more info to be able to decide on which indices to use?

--
-René
From: Lionel B. <lio...@bo...> - 2005-01-13 10:16:33
Rene Joergensen wrote the following on 01/13/05 10:48 :

> It's always nice to help enhance a good piece of software/idea.
>
> Speaking of performance reports, do you need more info to be able to decide on which indices to use?

In 1.5.x there will be an index on connect.ip_addr. This is the most obvious one and greatly improved the load on a MySQL database (see db_performance_reports).

There could be some benefit from indices on the timestamp in the 3 tables, but surprisingly it seems the cleanup process (which is the obvious target for timestamp-related optimisations) didn't benefit much from them when tested. As adding indices slows down database writes, I'm not considering adding timestamp ones yet (until someone brings test results where the db_cleanup is reduced from, say, 5s to 0.5s when adding one of them, for example).

1.5.x won't start until February as I'll be skiing from Saturday until then, and 1.4.x still has some TODOs.

Lionel.
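One way to produce such before/after numbers is to average the cleanup durations out of syslog across an index change. A rough sketch, assuming cleanup times are logged; the log line format below is made up for illustration and is not SQLgrey's actual output:

```shell
#!/bin/sh
# Hypothetical: average "cleaning took Ns" figures from syslog so two
# index configurations can be compared. Log format is illustrative only.
sample_log() {
cat <<'EOF'
Jan 13 10:00:01 mx sqlgrey: cleaning took 2s, deleted 1500 rows
Jan 13 10:30:02 mx sqlgrey: cleaning took 1s, deleted 1320 rows
EOF
}

# Extract the seconds figure from each line and average it.
avg_cleanup() {
    sed -n 's/.*took \([0-9][0-9]*\)s.*/\1/p' |
        awk '{ sum += $1; n++ } END { printf "avg %.1fs over %d runs\n", sum / n, n }'
}

sample_log | avg_cleanup
# prints: avg 1.5s over 2 runs
```

Run once against a log window without the timestamp index and once with it, and the two averages give exactly the kind of comparison asked for above.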
From: Rene J. <rg...@ba...> - 2005-01-13 11:59:48
On Thu, Jan 13, 2005 at 11:16:08AM +0100, Lionel Bouton wrote:
> In 1.5.x there will be an index on connect.ip_addr. This is the most obvious one and greatly improved the load on a MySQL database (see db_performance_reports).

Yeah, I know ;-)

> There could be some benefit from indices on the timestamp in the 3 tables, but surprisingly it seems the cleanup process (which is the obvious target for timestamp-related optimisations) didn't benefit much from them when tested. As adding indices slows down database writes, I'm not considering adding timestamp ones yet (until someone brings test results where the db_cleanup is reduced from, say, 5s to 0.5s when adding one of them, for example).

I tried removing the index I had on connect.first_seen. It doesn't really make a big difference: cleaning now takes 1-2 seconds instead of the 0-1 seconds before removing it. Mysqld doesn't seem to use more CPU and the queries on connect execute in 0.00 seconds when performed manually.

> 1.5.x won't start until February as I'll be skiing from Saturday until then, and 1.4.x still has some TODOs.

Have you looked at the automatic whitelist updating? Or should I try writing something using LWP? I guess the most reliable method is fetching via HTTP and comparing MD5 sums afterwards. Or did you have something different in mind?

--
-René
From: Lionel B. <lio...@bo...> - 2005-01-13 12:37:23
Rene Joergensen wrote the following on 01/13/2005 12:59 PM :

> > 1.5.x won't start until February as I'll be skiing from Saturday until then and 1.4.x still has some TODOs.
>
> Have you looked at the automatic whitelist updating? Or should I try writing something using LWP? I guess the most reliable method is fetching via HTTP and comparing MD5 sums afterwards. Or did you have something different in mind?

I would have used the following: a new entry in sqlgrey.conf like "whitelist_rooturl = http://sqlgrey.bouton.name/whitelists". Then the update script (be it bash using wget, perl using LWP or whatever) will:

- create a temporary directory in /tmp with mktemp -d,
- fetch the two md5 files (named root_url/<whitelist_file>.md5) with timestamps (wget -N), compare them to the md5s in /etc/sqlgrey/; if one of them is newer (or there's no md5 in /etc/sqlgrey) continue, else abort,
- fetch the missing whitelist files, compare md5s; if successful continue, else output an error (let cron manage the mail handling),
- optionally (new conf var: update_whitelist_showdiff != 0), show the diffs on the standard output and let cron send them to the admin,
- move the whitelists and the *.md5 files to /etc/sqlgrey,
- send SIGUSR1 to the pid in /var/run/sqlgrey.pid,
- clean up the temp dir.

I'll try to find the time to code this before Saturday and release 1.4.3.

If people start hammering the poor whitelist server, I'll switch to the clamav way of managing this: use DNS to store a whitelist version. This has been proven quite efficient.

Lionel.
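The steps above could be sketched like this for a single whitelist file. This is a minimal, hypothetical sketch: the file name, paths and the injectable fetch command are assumptions, not an official SQLgrey interface, and the diff-showing step is omitted:

```shell
#!/bin/sh
# Hypothetical sketch of the whitelist update flow described above.
# File name and paths are assumptions; the fetch command is kept in a
# variable so the flow can be exercised without a network connection.
ROOT_URL="http://sqlgrey.bouton.name/whitelists"
CONF_DIR="/etc/sqlgrey"
WL="clients_ip_whitelist"
FETCH="wget -q -N"    # -N keeps the server timestamps

update_whitelist() {
    tmp=$(mktemp -d) || return 1
    (
        cd "$tmp" || exit 1
        # 1. fetch the md5 file; stop quietly if it matches the installed one
        $FETCH "$ROOT_URL/$WL.md5" || exit 1
        if cmp -s "$WL.md5" "$CONF_DIR/$WL.md5" 2>/dev/null; then
            exit 0    # nothing new
        fi
        # 2. fetch the whitelist itself and verify it against the md5 file
        $FETCH "$ROOT_URL/$WL" || exit 1
        md5sum -c "$WL.md5" >/dev/null 2>&1 || exit 1
        # 3. install both files and ask the running sqlgrey to reload
        mv "$WL" "$WL.md5" "$CONF_DIR/" || exit 1
        if [ -f /var/run/sqlgrey.pid ]; then
            kill -USR1 "$(cat /var/run/sqlgrey.pid)"
        fi
        exit 0
    )
    rc=$?
    rm -rf "$tmp"
    return $rc
}
```

Keeping the fetch command in a variable means a stub can stand in for wget during testing, and a nonzero return from update_whitelist is all cron needs to mail the admin an error.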