From: Michel B. <mi...@bo...> - 2005-02-05 12:58:24
On Saturday 05 February 2005 13:49, Lionel Bouton wrote:
>
> counters: what for? As I previously said, I'm not sure they will be
> useful as logs already hold more relevant information. It will hurt the
> performance of people willing to use them too (if I make them optional
> as requested) as this will need an update to the database on each and
> every mail SQLgrey will see (db updates are slow...).

There's already a database update on each and every mail that SQLgrey sees:
the last_seen entry gets updated.

I don't think that updating last_seen + counter would result in any noticeable
performance difference compared to updating last_seen alone...

Cheers.

-- 
Michel Bouissou <mi...@bo...> OpenPGP ID 0xDDE8AC6E
From: Lionel B. <lio...@bo...> - 2005-02-05 12:50:05
Michel Bouissou wrote the following on 02/05/05 10:09:

>On Friday 04 February 2005 21:18, Lionel Bouton wrote:
>
>>Seems to be a good thing to have, especially since it will be a small
>>column that shouldn't put much stress on the database. Added to my TODO.
>>
>>>2. Addition: client_name
>>
>>I'm afraid this won't be so easy:
>>- in the default 'smart' mode, most of the entries aren't IP addresses but
>>class C networks.
>>- a VARCHAR column with potentially very long names is not a welcome
>>addition: it could hurt performance badly.
>
>I would vote YES for "first_seen" and "counter" columns

I see the need for a first_seen: logs are mostly useless for this
information.

counters: what for? As I previously said, I'm not sure they will be
useful as logs already hold more relevant information. It will hurt the
performance of people willing to use them too (if I make them optional
as requested) as this will need an update to the database on each and
every mail SQLgrey will see (db updates are slow...).

If someone tells me what it will be used for that can't be done with a
simple log parsing tool, I'd be more inclined to put it in my TODO list.
Otherwise, I'll add a log parsing tool to the TODO...

> in "awl" tables, and I
>would vote NO for "client_name".
>
>Not only would "client_name" be meaningless for all the class-C records, but
>also a reverse DNS entry is something that can change over time. If we put a
>client_name column, we should update it with the client_name Postfix gives
>every time we update a record. (Note that we could very well put there the
>last client_name seen even when we use a class-C entry type, as this would
>still give an indication about the calling client.)
>But I'm not sure that having this in the tables would be very useful.

Agreed.

>I would also vote YES for having the same field name for the IP in all the
>tables.
>
>Adding columns to the tables raises the issue of upgrading: when upgrading
>from an older version to a version with new columns, should the new columns be
>initialized with arbitrary blank/zero or "today" values, or should we
>completely drop the existing tables and let them rebuild by themselves from
>scratch with new, real data?

For first_seen, my plan is to make the update process set them to last_seen.
If counters are really needed, they will be set to 0.

Lionel.
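The upgrade step Lionel describes (seed first_seen from last_seen on existing rows) could look roughly like the sketch below. It uses Python's sqlite3 module as a stand-in database; the helper name add_first_seen and the sender_domain column are assumptions made for illustration, and only host_ip, last_seen and first_seen come from the thread. It is not SQLgrey's actual upgrade code.

```python
import sqlite3


def add_first_seen(conn: sqlite3.Connection, table: str) -> None:
    """Add a first_seen column and seed it from last_seen (hypothetical helper)."""
    # Table names cannot be bound as SQL parameters, so restrict them explicitly.
    if table not in ("from_awl", "domain_awl"):
        raise ValueError(f"unexpected table: {table}")
    conn.execute(f"ALTER TABLE {table} ADD COLUMN first_seen TIMESTAMP")
    # Existing rows have no real first_seen, so reuse last_seen as planned.
    conn.execute(f"UPDATE {table} SET first_seen = last_seen")
    conn.commit()


# Tiny in-memory demonstration with an assumed table layout.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE domain_awl (sender_domain TEXT, host_ip TEXT, last_seen TIMESTAMP)"
)
conn.execute(
    "INSERT INTO domain_awl VALUES ('example.org', '192.0.2', '2005-02-01 10:00:00')"
)
add_first_seen(conn, "domain_awl")
row = conn.execute("SELECT first_seen, last_seen FROM domain_awl").fetchone()
```

After the call, every pre-existing row carries a first_seen equal to its last_seen, so queries such as the "new entries in the last 5 minutes" one never see NULL values.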
From: Michel B. <mi...@bo...> - 2005-02-05 09:35:26
Hello,

Here are some samples of VERP-style single-use sender address local parts
that sqlgrey currently doesn't recognize, and that cause every mail from the
same origin to get greylisted again. Some may cause the sending domain to end
up in the "domain_awl" if the sample gets big enough, but some, being weekly
or monthly newsletters, may never produce the required number, and get
greylisted forever...

bounce-#-d03ca8744c532662291bfe53e62dddb3eab6aa94-#
photoservice-xtc-9x9-0lac-dd-c2t6m
photoservice-xtc-9x9-zj8o-dd-c2t6m
fr2_2743_html-2743_4
fr_2760_html-2760_1098
fr_2785_html-2785_1098
fr_2811_html-2811_1137
fr_2833_html-2833_1137

I suggest that we replace with # any series of possibly hex digits
[0-9A-Fa-f] separated from the rest of the address by one of [._-]

The "photoservice" samples would probably be harder to recognize.

Also, I believe that sqlgrey has no provision to recognize single-use sender
addresses including hashes such as what systems like SRS, SES or BATV can
produce...

Cheers.

-- 
Michel Bouissou <mi...@bo...> OpenPGP ID 0xDDE8AC6E
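The replacement rule suggested above can be sketched as follows. This is only an illustration of the proposal, not SQLgrey's normalization code; the function name normalize_local_part is hypothetical.

```python
import re

# A run of hex digits counts only when it is delimited on both sides by one of
# '.', '_', '-' or by the start/end of the local part.
_HEX_RUN = re.compile(r"(?<![^._-])[0-9A-Fa-f]+(?![^._-])")


def normalize_local_part(local_part: str) -> str:
    """Replace every delimited hex run with '#' (hypothetical helper)."""
    return _HEX_RUN.sub("#", local_part)


print(normalize_local_part("fr_2760_html-2760_1098"))
print(normalize_local_part("photoservice-xtc-9x9-0lac-dd-c2t6m"))
```

With this rule the fr_..._html samples all collapse to the same key, while the two "photoservice" samples still differ (only the "dd" token is replaced), which matches the remark that they are harder to recognize. One caveat of the approach: ordinary words made only of the letters a-f, such as "dead" or "cafe", would also be replaced.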
From: Michel B. <mi...@bo...> - 2005-02-05 09:10:01
On Friday 04 February 2005 21:18, Lionel Bouton wrote:
>
> Seems to be a good thing to have, especially since it will be a small
> column that shouldn't put much stress on the database. Added to my TODO.
>
> >2. Addition: client_name
>
> I'm afraid this won't be so easy:
> - in the default 'smart' mode, most of the entries aren't IP addresses but
> class C networks.
> - a VARCHAR column with potentially very long names is not a welcome
> addition: it could hurt performance badly.

I would vote YES for "first_seen" and "counter" columns in "awl" tables, and I
would vote NO for "client_name".

Not only would "client_name" be meaningless for all the class-C records, but
also a reverse DNS entry is something that can change over time. If we put a
client_name column, we should update it with the client_name Postfix gives
every time we update a record. (Note that we could very well put there the
last client_name seen even when we use a class-C entry type, as this would
still give an indication about the calling client.)
But I'm not sure that having this in the tables would be very useful.

I would also vote YES for having the same field name for the IP in all the
tables.

Adding columns to the tables raises the issue of upgrading: when upgrading
from an older version to a version with new columns, should the new columns be
initialized with arbitrary blank/zero or "today" values, or should we
completely drop the existing tables and let them rebuild by themselves from
scratch with new, real data?

Regards.

-- 
Michel Bouissou <mi...@bo...> OpenPGP ID 0xDDE8AC6E
From: Lionel B. <lio...@bo...> - 2005-02-04 20:18:57
Michael Storz wrote the following on 02/04/05 17:25:

>Hi Lionel,
>
>to be a little bit more detailed than Max :-)

I must admit I had to read Max's mail twice...

>We just started to use greylisting and the first day shows a reduction of
>spam by about a factor of 15, that's really great. However, looking
>around in the logfiles and the mysql database, I am missing some
>information, to help me see what actually happens.
>
>Therefore I would like three additions to the tables in the database.
>
>1. Addition: first_seen
>
>Extra field first_seen also for tables from_awl and domain_awl. With this
>addition you are able to see which new entries have been entered into
>the database like it is possible now with table connect:
>
>select * from connect where first_seen > now() - interval 5 minute;
>
>With the from_awl and domain_awl you can only find out which entries have
>been added OR have been updated.

Seems to be a good thing to have, especially since it will be a small
column that shouldn't put much stress on the database. Added to my TODO.

>2. Addition: client_name
>
>Extra field client_name in all 3 tables. This would help a human to see
>from where a connection came. Otherwise, you must always use nslookup or
>dig to find the name.

I'm afraid this won't be so easy:
- in the default 'smart' mode, most of the entries aren't IP addresses but
class C networks.
- a VARCHAR column with potentially very long names is not a welcome
addition: it could hurt performance badly.

>3. Addition: usage_count
>
>Every update of an entry in from_awl and domain_awl should increment a
>usage_count.

I like the concept of storing usage data somewhere, but will a
usage_count be enough? For example I'd like to know the top 10
domain_awl entries used last week, but usage_count won't give them to me.
I'm wondering if what you need isn't a separate log file parser that
would get you various statistics (most frequent spam sources, most used
domains, AWL efficiency, ...). The log file should already have
everything you need to compute these stats.

>The processing of these fields by sqlgrey should be triggered by
>configuration options. People who do not need the information and do
>not want to waste storage would disable these features.
>
>4. Consistent naming
>
>In table connect ip_addr is used whereas host_ip is used in from_awl and
>domain_awl. Since it depends on the greylisting mode whether the IP address is
>a full host address or a class C network, you should use ip_addr for all
>three tables. Now, if you try to find out what information is in every
>table about an IP address, you can't just change the table name in the
>select, but you have to change the field name too.

Agreed. Added to my TODO.

Lionel.
From: Michael S. <Mic...@lr...> - 2005-02-04 16:25:45
Hi Lionel,

to be a little bit more detailed than Max :-)

We just started to use greylisting and the first day shows a reduction of
spam by about a factor of 15, that's really great. However, looking
around in the logfiles and the mysql database, I am missing some
information, to help me see what actually happens.

Therefore I would like three additions to the tables in the database.

1. Addition: first_seen

Extra field first_seen also for tables from_awl and domain_awl. With this
addition you are able to see which new entries have been entered into
the database like it is possible now with table connect:

select * from connect where first_seen > now() - interval 5 minute;

With the from_awl and domain_awl you can only find out which entries have
been added OR have been updated.

2. Addition: client_name

Extra field client_name in all 3 tables. This would help a human to see
from where a connection came. Otherwise, you must always use nslookup or
dig to find the name.

3. Addition: usage_count

Every update of an entry in from_awl and domain_awl should increment a
usage_count.

The processing of these fields by sqlgrey should be triggered by
configuration options. People who do not need the information and do
not want to waste storage would disable these features.

4. Consistent naming

In table connect ip_addr is used whereas host_ip is used in from_awl and
domain_awl. Since it depends on the greylisting mode whether the IP address is
a full host address or a class C network, you should use ip_addr for all
three tables. Now, if you try to find out what information is in every
table about an IP address, you can't just change the table name in the
select, but you have to change the field name too.

Thanks,
Michael

On Fri, 4 Feb 2005, Max Diehn wrote:

> Hi Lionel,
>
> what do You think about the following issues to ease data mining:
>
> -> field 'client_name' in connect
> -> field 'first_seen' in from_awl, domain_awl
> -> logging (now()-first_seen) within 'too early' - statements in the
> logfile (easier to grep)
> -> putting all info concerning a single transaction into one single line
> instead of two lines (makes it easier to grep)
>
> BTW, could You rename connect.ip_addr into host_ip for consistency with
> from_awl and domain_awl? I understand that, from a point of view of
> smart or c-class greylisting, these are different concepts. But from the
> sql schema point of view I find it rather uncomfortable to use different
> names for this field.
>
> LBNL, hope You enjoyed skiing and returned in good health when You read
> this (I start my first skiing holiday in my life tonight!)
>
> Max

Michael Storz
-------------------------------------------------
Leibniz-Rechenzentrum   ! <mailto:St...@lr...>
Barer Str. 21           ! Fax: +49 89 2809460
80333 Muenchen, Germany ! Tel: +49 89 289-28840
From: Max D. <Max...@lr...> - 2005-02-04 15:32:19
Hi Lionel,

what do You think about the following issues to ease data mining:

-> field 'client_name' in connect
-> field 'first_seen' in from_awl, domain_awl
-> logging (now()-first_seen) within 'too early' - statements in the
logfile (easier to grep)
-> putting all info concerning a single transaction into one single line
instead of two lines (makes it easier to grep)

BTW, could You rename connect.ip_addr into host_ip for consistency with
from_awl and domain_awl? I understand that, from a point of view of
smart or c-class greylisting, these are different concepts. But from the
sql schema point of view I find it rather uncomfortable to use different
names for this field.

LBNL, hope You enjoyed skiing and returned in good health when You read
this (I start my first skiing holiday in my life tonight!)

Max
From: Lionel B. <lio...@bo...> - 2005-02-02 13:26:18
Klaus Alexander Seistrup wrote the following on 01/28/2005 10:06 AM:

>For those of us who like to run SQLgrey under e.g. runit¹ or
>dæmontools² it makes more sense to log to stdout/stderr than to
>syslog. Please bring back the
>
> log_file => $opt{daemonize} ? 'Sys::Syslog' : undef,
>
>stanza. Thanks. :-)

Will be done in 1.4.3.
From: Klaus A. S. <kse...@gm...> - 2005-01-28 09:06:58
For those of us who like to run SQLgrey under e.g. runit¹ or
dæmontools² it makes more sense to log to stdout/stderr than to
syslog. Please bring back the

 log_file => $opt{daemonize} ? 'Sys::Syslog' : undef,

stanza. Thanks. :-)

Cheers,

// Klaus

¹) http://smarden.org/runit/
²) http://cr.yp.to/daemontools.html

-- 
Klaus Alexander Seistrup
SubZeroNet · Copenhagen · Denmark
From: Lionel B. <lio...@bo...> - 2005-01-22 18:59:05
Michel Bouissou wrote the following on 01/22/05 19:47:

># 19/01/2005
># camppool03.emailebay.com[216.33.244.102]
># from=<ebay.#.#.#@reply.ebay.com> helo=<camp13.sjc.ebay.com>
># Comment: Apparently legit eBay domain / mailserver

According to the whois, the domain at least belongs to eBay. Added.

Lionel.
From: Michel B. <mi...@bo...> - 2005-01-22 18:47:58
# 19/01/2005
# camppool03.emailebay.com[216.33.244.102]
# from=<ebay.#.#.#@reply.ebay.com> helo=<camp13.sjc.ebay.com>
# Comment: Apparently legit eBay domain / mailserver
# Reason: No retry.
/^camppool\d+\.emailebay\.com$/

-- 
Michel Bouissou <mi...@bo...> OpenPGP ID 0xDDE8AC6E
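To check what a submitted whitelist expression covers before adding it, the pattern can be tried against a few client names. The sketch below uses Python's re module; the list name and the is_whitelisted helper are hypothetical, and the surrounding slashes of the original entry are Perl-style delimiters that are dropped here.

```python
import re

# Pattern taken from the report above; anchored at both ends, so only names
# of the form camppool<digits>.emailebay.com match.
CLIENT_FQDN_WHITELIST = [re.compile(r"^camppool\d+\.emailebay\.com$")]


def is_whitelisted(client_name: str) -> bool:
    """Return True if any whitelist expression matches the client's rDNS name."""
    return any(p.search(client_name) for p in CLIENT_FQDN_WHITELIST)


print(is_whitelisted("camppool03.emailebay.com"))
print(is_whitelisted("camppool03.emailebay.com.example.net"))
```

The anchors matter: without the trailing `$`, a spammer-controlled name that merely starts with the whitelisted string would also pass.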
From: Lionel B. <lio...@bo...> - 2005-01-21 22:37:59
Michel Bouissou wrote the following on 01/17/05 07:44:

>On Sunday 16 January 2005 22:39, Klaus Alexander Seistrup wrote:
>
>>On <http://sqlgrey.sf.net/> I read:
>>
>><snip>
>>Beginning with 1.2 and 1.3, odd subversion releases are considered
>>stable, bugfix only releases, even subversion releases are for
>>developers, betatesters and people living on the bleeding edge.
>>
>>The latest stable version is now SQLgrey-1.4.2
>>The 1.5 branch isn't opened yet.
>></snip>
>>
>>So . . . how do I interpret this?
>
>Looks like a typo to me... Usually, even versions are stable, and odd are
>development.

Typo! Corrected.

I'm back from my first skiing week (pause of 3 days): good weather, then
bad, then smooth snow, then ugly weather. Looking back: good week overall.

Lionel.
From: Michel B. <mi...@bo...> - 2005-01-18 06:58:54
Hi there,

This IP has been treated as "full" instead of "C Class" by SQLgrey, probably
due to the fact that the rDNS is an alias:

[root@totor sqlgrey]# host 193.252.22.175
175.22.252.193.in-addr.arpa is an alias for 175.160-28.22.252.193.in-addr.arpa.
175.160-28.22.252.193.in-addr.arpa domain name pointer smtp.voila.fr.

Cheers.

-- 
Michel Bouissou <mi...@bo...> OpenPGP ID 0xDDE8AC6E
From: Michel B. <mi...@bo...> - 2005-01-17 06:45:20
On Sunday 16 January 2005 22:39, Klaus Alexander Seistrup wrote:
>
> On <http://sqlgrey.sf.net/> I read:
>
> <snip>
> Beginning with 1.2 and 1.3, odd subversion releases are considered
> stable, bugfix only releases, even subversion releases are for
> developers, betatesters and people living on the bleeding edge.
>
> The latest stable version is now SQLgrey-1.4.2
> The 1.5 branch isn't opened yet.
> </snip>
>
> So . . . how do I interpret this?

Looks like a typo to me... Usually, even versions are stable, and odd are
development.

The fact that SQLgrey 1.4.2 is fairly stable, and that the 1.5 branch isn't
opened yet, makes me assume this is the same here.

-- 
Michel Bouissou <mi...@bo...> OpenPGP ID 0xDDE8AC6E
From: Allan J. <al...@no...> - 2005-01-16 21:56:46
Klaus Alexander Seistrup wrote, On 16-01-2005 22:43:

> P.S.: On a side note, let me tell you that at least two of the major
> ISP's here in Denmark -- Telia and Orange -- are using greylisting (I
> don't know which dæmon they're using, though). Cool, huh?

Well, maybe Rene Joergensen will reveal which GL daemon Telia is using ;)

-- 
Allan Joergensen - http://nowhere.dk/
"I'm appealing against my exam results" Orville remarked.
From: Klaus A. S. <kse...@gm...> - 2005-01-16 21:43:37
P.S.: On a side note, let me tell you that at least two of the major
ISP's here in Denmark -- Telia and Orange -- are using greylisting (I
don't know which dæmon they're using, though). Cool, huh?

-- 
Klaus Alexander Seistrup
SubZeroNet · Copenhagen · Denmark
From: Klaus A. S. <kse...@gm...> - 2005-01-16 21:39:53
Hi,

On <http://sqlgrey.sf.net/> I read:

<snip>
Beginning with 1.2 and 1.3, odd subversion releases are considered
stable, bugfix only releases, even subversion releases are for
developers, betatesters and people living on the bleeding edge.

The latest stable version is now SQLgrey-1.4.2
The 1.5 branch isn't opened yet.
</snip>

So . . . how do I interpret this?

Cheers,

-- 
Klaus Alexander Seistrup
SubZeroNet · Copenhagen · Denmark
From: Michel B. <mi...@bo...> - 2005-01-15 09:24:46
# 15/01/2005
# smtp.mandrake.org[212.43.244.24]
# from=<re...@ma...> helo=<smtp.mandrake.org>
# new: 212.43.244: re...@ma...
# Comment: Newsletter wanted by user
# Reason: No retry.

-- 
Michel Bouissou <mi...@bo...> OpenPGP ID 0xDDE8AC6E
From: Michel B. <mi...@bo...> - 2005-01-14 09:07:08
# 12/01/2005
# unitymail.alapage.com[195.101.94.169]
# from=<05_...@ac...>
# helo=<unitymail.alapage.com>
# new: 195.101.94: 05_01_11_teaser_soldes_htm.um.a.#.#@actu.alapage.com
# Comment: Newsletter wanted by user - Unmanaged VERP sender
# Reason: No retry.

-- 
Michel Bouissou <mi...@bo...> OpenPGP ID 0xDDE8AC6E
From: Michel B. <mi...@bo...> - 2005-01-13 13:46:57
On Thursday 13 January 2005 11:41, Lionel Bouton wrote:
>
> >This is also an option, but the decision whether or not to use the "full"
> >algorithm is not perfect.
>
> Quite true, this is only a heuristic.
>
> > I've seen many cases where SQLgrey uses the "class
> >C" algorithm for end-user DSL addresses. The way different ISPs name their
> >end-user pools can vary quite a lot...
>
> Could you report them to me? If possible I'd like to make SQLgrey aware
> of them.

Depending upon how you isolate the last byte of the IP address, such a list of
misses could include hostnames like (analyzed from my own mailserver logs):

1Cust166.tnt34.rtm1.nld.da.uu.net[213.116.162.166]
1Cust34.tnt10.ber2.deu.da.uu.net[149.225.214.34]
1Cust63.tnt2.ber2.deu.da.uu.net[149.225.54.63]
40322032.ptr.dia.nextlink.net[64.50.32.50]
ACB01AF0.ipt.aol.com[172.176.26.240]
ACB2B019.ipt.aol.com[172.178.176.25]
ACB4C175.ipt.aol.com[172.180.193.117]
ACB59C5D.ipt.aol.com[172.181.156.93]
ACB6068A.ipt.aol.com[172.182.6.138]
ACB8796A.ipt.aol.com[172.184.121.106]
ACBBCA2F.ipt.aol.com[172.187.202.47]
anba-c34712aa.pool.mediaWays.net[195.71.18.170]
asd-blm-2066b.adsl.wanadoo.nl[81.70.36.107]
bxy114.neoplus.adsl.tpnet.pl[83.30.18.114]
c8a65a38.bhz.virtua.com.br[200.166.90.56]
c9069c83.virtua.com.br[201.6.156.131]
c9113d37.rjo.virtua.com.br[201.17.61.55]
cable66a249.usuarios.retecal.es[213.254.66.249]
cable73a151.usuarios.retecal.es[213.254.73.151]
cam29.neoplus.adsl.tpnet.pl[83.30.84.29]
catv-5063672e.catv.broadband.hu[80.99.103.46]
cc84041-a.hnglo1.ov.home.nl[212.204.159.14]
cpc2-darl2-5-1-cust168.midd.cable.ntl.com[82.6.207.168]
cpc2-darl2-5-1-cust246.midd.cable.ntl.com[82.6.207.246]
CPE000c4189a793-CM014500105533.cpe.net.cable.rogers.com[24.112.207.118]
dial-1159.lubin.dialog.net.pl[62.87.209.135]
dialup111.sofia.spnet.net[213.169.32.111]
dialup117.sofia.spnet.net[213.169.32.117]
dialup13.sofia.spnet.net[213.169.32.13]
dialup63.nss.ltk.is.com.fj[202.62.120.110]
dsl81-214-756.adsl.ttnet.net.tr[81.214.2.244]
dsl81-215-11081.adsl.ttnet.net.tr[81.215.43.73]
dsl81-215-12127.adsl.ttnet.net.tr[81.215.47.95]
dsl81-215-12621.adsl.ttnet.net.tr[81.215.49.77]
dsl81-215-43162.adsl.ttnet.net.tr[81.215.168.154]
dsl81-215-5152.adsl.ttnet.net.tr[81.215.20.32]
dsl81-215-5910.adsl.ttnet.net.tr[81.215.23.22]
dsl81-215-6900.adsl.ttnet.net.tr[81.215.26.244]
h000d56113fb3.ne.client2.attbi.com[24.131.134.86]
h0010b568f3a3.ne.client2.attbi.com[24.61.154.36]
h0040cab53019.ne.client2.attbi.com[24.91.135.81]
jangce-1174.adsl.datanet.hu[195.56.12.158]
M339P015.dipool.highway.telekom.at[62.46.32.79]
modemcable171.52-130-66.mc.videotron.ca[66.130.52.171]
modemcable179.240-203-24.mc.videotron.ca[24.203.240.179]
modemcable225.184-201-24.mc.videotron.ca[24.201.184.225]
modemcable227.134-203-24.mc.videotron.ca[24.203.134.227]
modemcable245.107-70-69.mc.videotron.ca[69.70.107.245]
modemcable254.86-201-24.mc.videotron.ca[24.201.86.254]
mstr195175-29437.dial-in.ttnet.net.tr[195.175.194.254]
mstr195175-30267.dial-in.ttnet.net.tr[195.175.198.60]
n4z78l145.broadband.ctm.net[202.175.78.145]
p3E9E8E7C.dip.t-dialin.net[62.158.142.124]
p508360E9.dip0.t-ipconnect.de[80.131.96.233]
p50836131.dip0.t-ipconnect.de[80.131.97.49]
p50837496.dip0.t-ipconnect.de[80.131.116.150]
p50837EAD.dip0.t-ipconnect.de[80.131.126.173]
p5084BAE5.dip.t-dialin.net[80.132.186.229]
p508A9316.dip0.t-ipconnect.de[80.138.147.22]
p508CC432.dip0.t-ipconnect.de[80.140.196.50]
p509134B9.dip.t-dialin.net[80.145.52.185]
p5480DED9.dip.t-dialin.net[84.128.222.217]
p548115DE.dip.t-dialin.net[84.129.21.222]
p54878AEC.dip.t-dialin.net[84.135.138.236]
pcp02171061pcs.brghtn01.mi.comcast.net[68.43.207.125]
pcp02587825pcs.shlb1201.mi.comcast.net[68.84.168.99]
pcp08413319pcs.savana01.ga.comcast.net[68.51.166.139]
pD9523D7B.dip.t-dialin.net[217.82.61.123]
pD953A06D.dip.t-dialin.net[217.83.160.109]
pD95B65B3.dip0.t-ipconnect.de[217.91.101.179]
pD9E578A1.dip.t-dialin.net[217.229.120.161]
pD9EC4616.dip0.t-ipconnect.de[217.236.70.22]
pD9EC50D1.dip0.t-ipconnect.de[217.236.80.209]
pD9EC52FB.dip0.t-ipconnect.de[217.236.82.251]
pD9FA8F9F.dip.t-dialin.net[217.250.143.159]
rt-z-23c40.adsl.wanadoo.nl[81.70.90.64]
S010600055d07eddd.gv.shawcable.net[24.108.127.244]
S010600055dff39b9.vc.shawcable.net[24.87.42.137]
S01060007e91f3b26.vc.shawcable.net[24.80.147.116]
S01060010dca27adf.vn.shawcable.net[24.85.211.61]
S01060050049395bf.gv.shawcable.net[24.108.153.242]
S01060050229c08e8.vs.shawcable.net[24.81.90.251]
S01060050bab21b9b.cg.shawcable.net[68.144.198.200]
S01060050bf78aeb5.rd.shawcable.net[70.65.89.118]
S01060050bfacf890.ok.shawcable.net[24.71.140.130]
S01060080c6f85ba7.vf.shawcable.net[70.68.194.161]
S010600e029961f94.gv.shawcable.net[24.68.6.189]
user-0cej1dr.cable.mindspring.com[24.233.133.187]
user-0cetn71.cable.mindspring.com[24.238.220.225]
user-0cevf7e.cable.mindspring.com[24.239.188.238]
user-12hc133.cable.mindspring.com[69.22.4.99]
user242.res.openband.net[65.246.82.242]

I personally use in some administrative bash scripts a very complex extended
grep regexp (which is not a Perl regexp, sorry), that is also heuristic but
shows very few mistakes. It bases its analysis on hostname[ip_address] as
found in a Postfix log.

Here is the regexp; maybe you can get some ideas from it, or it could be of
some use to somebody:

egrep -i "(^|[0-9.x_-])(((c|cm|h|host|m)?0*([1-9]{1,3}[0-9]{0,2})[._-].*\[[.0-9]+\.\5\])|(abo|broadband|(hk)?cablep?|catv|d?client2?|cust(omer)?s?|dhcp|dial?(in|up)?|dip|[asx]?dsl|dyn(amic)?|home|in-addr|modem(cable)?|(di)?pool|ppp|ptr|rev|static|user|YahooBB[0-9]{12}|c[[:alnum:]]{6,}(\.[a-z]{3})?\.virtua|[1-9]Cust[0-9]+|ACB[0-9A-F]{5}\.ipt|pcp[0-9]{8}pcs|S0106[[:alnum:]]{12,}\.[a-z]{2})[0-9.x_-]|unknown\[)"

(All on one line ;-)

You can test it against your own server logs by copying/pasting (on one single
line) an instruction as follows, for example:

[root@totor sqlgrey]# zcat /var/log/mail/info.1 | egrep "postfix/smtpd\[[0-9]+\]: connect from " | cut -f4- -d: | cut -f4- -d" " | sort -u | egrep -i "(^|[0-9.x_-])(((c|cm|h|host|m)?0*([1-9]{1,3}[0-9]{0,2})[._-].*\[[.0-9]+\.\5\])|(abo|broadband|(hk)?cablep?|catv|d?client2?|cust(omer)?s?|dhcp|dial?(in|up)?|dip|[asx]?dsl|dyn(amic)?|home|in-addr|modem(cable)?|(di)?pool|ppp|ptr|rev|static|user|YahooBB[0-9]{12}|c[[:alnum:]]{6,}(\.[a-z]{3})?\.virtua|[1-9]Cust[0-9]+|ACB[0-9A-F]{5}\.ipt|pcp[0-9]{8}pcs|S0106[[:alnum:]]{12,}\.[a-z]{2})[0-9.x_-]|unknown\[)" | less

Then you'll see all that it catches.

Try the same, but with "egrep -iv" instead of "egrep -i" to check what it does
NOT catch.

-- 
Michel Bouissou <mi...@bo...> OpenPGP ID 0xDDE8AC6E

To be in fashion is the ambition of a dead leaf
...or of a wet fart.
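The first branch of the expression above (the last byte of the IP address showing up in the hostname) can be illustrated with a much smaller sketch. This is a simplified illustration only, not the egrep expression itself and not SQLgrey's heuristic; the function name looks_like_end_user is hypothetical, and the keyword branch (dsl, dialup, pool, ...) is deliberately left out.

```python
import re


def looks_like_end_user(hostname: str, ip: str) -> bool:
    """Guess whether the rDNS name embeds the last byte of the IP address.

    Simplified assumption: the last byte appears in decimal, possibly
    zero-padded, is not part of a longer digit run on its left, and is
    followed by one of the separators '.', '_' or '-'.
    """
    last_byte = ip.rsplit(".", 1)[-1]
    pattern = re.compile(r"(?<![0-9])0*" + re.escape(last_byte) + r"[._-]")
    return bool(pattern.search(hostname))


print(looks_like_end_user("modemcable171.52-130-66.mc.videotron.ca", "66.130.52.171"))
print(looks_like_end_user("p3E9E8E7C.dip.t-dialin.net", "62.158.142.124"))
```

The second call shows the limit of this branch: names that encode the address in hexadecimal contain no decimal last byte, which is why the full expression also carries provider-specific keywords and patterns.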
From: Lionel B. <lio...@bo...> - 2005-01-13 12:37:23
Rene Joergensen wrote the following on 01/13/2005 12:59 PM:

>>1.5.x won't start until February as I'll be skiing from Saturday until
>>then and 1.4.x still has some TODOs.
>
>Have you looked at the automatic whitelist updating? Or should I try
>writing something using LWP? I guess the most reliable method is
>fetching via HTTP and comparing MD5 sums afterwards. Or did you have
>something different in mind?

I would have used the following: a new entry in sqlgrey.conf like
"whitelist_rooturl = http://sqlgrey.bouton.name/whitelists"

Then the update script (be it bash using wget, perl using LWP or whatever)
will:
- create a temporary directory in /tmp with mktemp -d,
- fetch two md5 files (named root_url/<whitelist_file>.md5) with
  timestamps (wget -N) and compare them to the md5 in /etc/sqlgrey/; if one
  of them is newer (or there's no md5 in /etc/sqlgrey), continue, else abort,
- fetch the missing whitelist files and compare md5; if successful,
  continue, else output an error (let cron manage the mail handling),
- optionally (new conf var: update_whitelist_showdiff != 0), show the
  diffs on the standard output and let cron send them to the admin,
- move the whitelists and the *.md5 to /etc/sqlgrey,
- send SIGUSR1 to the pid in /var/run/sqlgrey.pid,
- clean up the temp dir.

I'll try to find the time to code this before Saturday and release 1.4.3.

If people start hammering the poor whitelist server, I'll switch to the
clamav way of managing this: use DNS to store a whitelist version. This
has been proven quite efficient.

Lionel.
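The two MD5 checks in the plan above can be sketched as follows. This covers only the comparison steps; fetching, the diff display, moving files and the SIGUSR1 reload are left out. The function names are hypothetical, and the sketch compares checksum contents rather than file timestamps, which is a simplification of the "wget -N" step described in the plan.

```python
import hashlib
from pathlib import Path


def _first_token(text: str) -> str:
    """Extract the hex digest from 'digest  filename' style md5 file content."""
    parts = text.split()
    return parts[0].lower() if parts else ""


def needs_update(remote_md5_text: str, local_md5_file: Path) -> bool:
    """True when no local .md5 exists yet or it differs from the fetched one."""
    if not local_md5_file.exists():
        return True
    return _first_token(local_md5_file.read_text()) != _first_token(remote_md5_text)


def verify_download(whitelist_file: Path, remote_md5_text: str) -> bool:
    """True when the downloaded whitelist matches the published checksum."""
    digest = hashlib.md5(whitelist_file.read_bytes()).hexdigest()
    return digest == _first_token(remote_md5_text)
```

An update script built on this would call needs_update first, and only replace the files in /etc/sqlgrey when verify_download succeeds for every fetched whitelist, so a truncated download never overwrites a working whitelist.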
From: Rene J. <rg...@ba...> - 2005-01-13 11:59:48
On Thu, Jan 13, 2005 at 11:16:08AM +0100, Lionel Bouton wrote:

> In 1.5.x there will be an index on connect.ip_addr. This is the most
> obvious one and greatly improved the load on a mysql database (see
> db_performance_reports).

Yeah, I know ;-)

> There could be some benefit for indices on the timestamp in the 3
> tables, but surprisingly it seems the cleanup process (which is the
> obvious target for timestamp related optimisations) didn't benefit much
> from it when tested. As adding indices slows the database writes I'm not
> considering adding timestamp ones yet (until someone brings test results
> where the db_cleanup is reduced from say 5s to 0.5s when adding one of
> them for example).

I tried removing the index I had on connect.first_seen. It doesn't
really make a big difference: cleaning now takes 1-2 seconds instead of
0-1 seconds before removing it. Mysqld doesn't seem to use more CPU and
the queries on connect are executed in 0.00 seconds when performed
manually.

> 1.5.x won't start until February as I'll be skiing from Saturday until
> then and 1.4.x still has some TODOs.

Have you looked at the automatic whitelist updating? Or should I try
writing something using LWP? I guess the most reliable method is
fetching via HTTP and comparing MD5 sums afterwards. Or did you have
something different in mind?

-- 
-René
From: Lionel B. <lio...@bo...> - 2005-01-13 10:41:14
Michel Bouissou wrote the following on 01/13/05 11:06:

>This is also an option, but the decision whether or not to use the "full"
>algorithm is not perfect.

Quite true, this is only a heuristic.

> I've seen many cases where SQLgrey uses the "class
>C" algorithm for end-user DSL addresses. The way different ISPs name their
>end-user pools can vary quite a lot...

Could you report them to me? If possible I'd like to make SQLgrey aware
of them.

Lionel
From: Lionel B. <lio...@bo...> - 2005-01-13 10:16:33
Rene Joergensen wrote the following on 01/13/05 10:48:

>It's always nice to help enhance a good piece of software/idea.
>
>Speaking of performance reports, do you need more info to be able to
>decide on which indices to use?

In 1.5.x there will be an index on connect.ip_addr. This is the most
obvious one and greatly improved the load on a mysql database (see
db_performance_reports).

There could be some benefit for indices on the timestamp in the 3
tables, but surprisingly it seems the cleanup process (which is the
obvious target for timestamp related optimisations) didn't benefit much
from it when tested. As adding indices slows the database writes I'm not
considering adding timestamp ones yet (until someone brings test results
where the db_cleanup is reduced from say 5s to 0.5s when adding one of
them for example).

1.5.x won't start until February as I'll be skiing from Saturday until
then and 1.4.x still has some TODOs.

Lionel.
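The effect of the planned index on connect.ip_addr can be observed with a query plan. The sketch below uses SQLite through Python's sqlite3 module as a stand-in for MySQL, so the plan output format differs from what MySQL's EXPLAIN would show; the index name and every column other than ip_addr and first_seen are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed layout of the connect table; only ip_addr and first_seen are
# named in the thread.
conn.execute(
    "CREATE TABLE connect ("
    "sender_name TEXT, sender_domain TEXT, ip_addr TEXT, rcpt TEXT, "
    "first_seen TIMESTAMP)"
)
conn.execute("CREATE INDEX connect_ip_addr_idx ON connect (ip_addr)")

# Ask the planner how it would look up a connecting client's address.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM connect WHERE ip_addr = ?",
    ("192.0.2.1",),
).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan)
print(plan_text)
```

The plan should report a search using connect_ip_addr_idx rather than a full table scan. As noted in the message above, each extra index also adds work to every insert and update, which is the reason given for not indexing the timestamp columns yet.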
From: Michel B. <mi...@bo...> - 2005-01-13 10:06:48
On Thursday 13 January 2005 10:55, Lionel Bouton wrote:
>
> I thought of that and even discussed this very same idea on
> postfix-users some time ago. I didn't have practical data to back my
> claims. Good to know that this wasn't only theoretical.
> There's a new thing to take into consideration since then: smart and
> classc greylisting algorithms.
>
> The problem is that connect and awl entries now can reference whole
> classc networks to cover for the farm of outgoing mailservers trying to
> send the same e-mail. In this particular case, if they don't use the
> same HELO string to connect (probably the case if they use their public
> hostname), these algorithms are defeated.

True. Then we might only try to match on the 1st level domain found in the
HELO, as it is highly probable that 2 servers in the same farm will be
"machinename1.subnet.provider.com" and "machinename2.subnet.provider.com",
but they will for sure share the "provider.com" domain.

But this would eliminate viruses that come once with "oemcomputer.com", and
come back later using "oemcomputer.org", or spambots that come once with
"HELO qsdfgh.org" and later "HELO azerty.org"

> We could use it when the 'full' algorithm is used or when there's no
> valid reverse DNS when the 'smart' algorithm is used.

This is also an option, but the decision whether or not to use the "full"
algorithm is not perfect. I've seen many cases where SQLgrey uses the "class
C" algorithm for end-user DSL addresses. The way different ISPs name their
end-user pools can vary quite a lot...

> Added to my TODO, 1.5.x or later. That will have to be tested carefully
> though...

I volunteer ;-)

-- 
Michel Bouissou <mi...@bo...> OpenPGP ID 0xDDE8AC6E
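The HELO comparison discussed above (match on the provider's domain rather than the full HELO string) could be sketched like this. Both function names are hypothetical, and taking the last two labels is a naive assumption: it gives the wrong answer for names under multi-label suffixes such as "co.uk".

```python
def helo_base_domain(helo: str) -> str:
    """Return the last two labels of a HELO name (naive base domain)."""
    labels = helo.strip().rstrip(".").lower().split(".")
    return ".".join(labels[-2:])


def same_sender_farm(helo_a: str, helo_b: str) -> bool:
    """Treat two HELO strings as equivalent when their base domains match."""
    return helo_base_domain(helo_a) == helo_base_domain(helo_b)


print(same_sender_farm("machinename1.subnet.provider.com",
                       "machinename2.subnet.provider.com"))
print(same_sender_farm("oemcomputer.com", "oemcomputer.org"))
```

Two servers of the same farm compare as equal, while a client that changes its claimed domain between attempts does not, which is the behaviour the message argues for.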