From: Simon W. <es...@ou...> - 2003-02-18 07:21:10
On Mon, 17 Feb 2003, Wizard wrote:

> > May I suggest that you create a read_config routine that handles all of
> > this data munging on start up. Possibly populating a hash with an array
> > ref as value and blocked_users as the key.
>
> You could, but we're using a generic configuration mechanism
> (cfg->get(key)), so I don't think it's practical. We're not using a lot of
> arrays and they'll be mixed with strings with commas, so I don't want to
> make it look for specific variable names to determine storage type. There
> will also be user-defined configuration variables, so adding this level of
> complexity could be dangerous.

OK, that seems to do much the same thing. It just seemed to make sense to do
all the config variable processing in one place but...

> > This way, the processing code need only deal with the actual data which
> > makes any future updates to the config syntax simple to apply. No
> > scrabbling around in the code.
>
> I don't see that a simple split into an array is going to confuse anyone.
> When it's complete, take a look. If you think there is a better way, then
> please let me know.

...I'll wait to see the finished code. I'm curious about these user defined
variables and how they will be used. I'll wait to read the code/docs and
comment further then.

Simon.
From: Wizard <wi...@ne...> - 2003-02-18 00:25:56
> May I suggest that you create a read_config routine that handles all of
> this data munging on start up. Possibly populating a hash with an array
> ref as value and blocked_users as the key.

You could, but we're using a generic configuration mechanism
(cfg->get(key)), so I don't think it's practical. We're not using a lot of
arrays and they'll be mixed with strings with commas, so I don't want to make
it look for specific variable names to determine storage type. There will
also be user-defined configuration variables, so adding this level of
complexity could be dangerous.

> This way, the processing code need only deal with the actual data which
> makes any future updates to the config syntax simple to apply. No
> scrabbling around in the code.

I don't see that a simple split into an array is going to confuse anyone.
When it's complete, take a look. If you think there is a better way, then
please let me know.

Grant M.
From: Simon W. <es...@ou...> - 2003-02-17 23:13:41
On Mon, 17 Feb 2003, Wizard wrote:

> > > foreach( split /\,\s*/, $blocked_users ) {
> > >     push @bad_users, $_;
> > > }
> >
> > Why not use a list rather than a scalar, it would be consistent with our
> > other config directives and save on this step.
>
> $blocked_users is read externally from the configuration file, like so:
> === /.conf/.nms.cfg =====================================
> Blocked_users = fr...@fr..., ch...@ch..., ox...@bi...
> ===========================================================
> That's what the split is for. It could've been space separated, but I
> thought it would be more readable this way.

May I suggest that you create a read_config routine that handles all of this
data munging on start up. Possibly populating a hash with an array ref as
value and blocked_users as the key.

This way, the processing code need only deal with the actual data which makes
any future updates to the config syntax simple to apply. No scrabbling around
in the code.

> I actually plan on profiling the whole thing, over time. I don't plan on
> anything more than looking for bottlenecks for the first round.

Fairy nuff :)

Simon.
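The read_config idea suggested above could look roughly like the following. This is only a sketch under stated assumptions: the sub name `read_config` comes from the message, but the `%LIST_KEYS` table, the lower-casing of keys, the `Title` entry and the example.com addresses are hypothetical, and the real NMS configuration mechanism (`cfg->get(key)`) is not shown in the thread.

```perl
use strict;
use warnings;

# Assumption: only the keys named here hold comma-separated lists; every
# other value stays a plain string, commas and all. The key set is hypothetical.
my %LIST_KEYS = map { $_ => 1 } qw( blocked_users );

# Parse "Key = value" lines into a hash; list-valued keys become array refs.
sub read_config {
    my ($text) = @_;
    my %cfg;
    for my $line ( split /\n/, $text ) {
        # Lines that are blank, comments or "====" rulers do not match and are skipped.
        next unless $line =~ /^\s*(\w+)\s*=\s*(.*?)\s*$/;
        my ( $key, $value ) = ( lc $1, $2 );
        $cfg{$key} = $LIST_KEYS{$key}
            ? [ split /\s*,\s*/, $value ]
            : $value;
    }
    return \%cfg;
}

# Hypothetical configuration text; the addresses in the thread are elided.
my $cfg = read_config(<<'END');
Blocked_users = fred@example.com, charlie@example.org
Title = Hello, world
END

print join( '|', @{ $cfg->{blocked_users} } ), "\n";
print $cfg->{title}, "\n";
```

The explicit key table is exactly the "specific variable names" coupling objected to in the reply, so the sketch illustrates the trade-off rather than settling it: the string value with a comma survives intact only because the parser knows which keys are lists.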
From: Wizard <wi...@ne...> - 2003-02-17 19:17:08
> > foreach( split /\,\s*/, $blocked_users ) {
> >     push @bad_users, $_;
> > }
>
> Why not use a list rather than a scalar, it would be consistent with our
> other config directives and save on this step.

$blocked_users is read externally from the configuration file, like so:

=== /.conf/.nms.cfg =====================================
Blocked_users = fr...@fr..., ch...@ch..., ox...@bi...
===========================================================

That's what the split is for. It could've been space separated, but I thought
it would be more readable this way.

> I'd probably throw a 'last' in here. Only one match is needed. Then
> again, see note about premature optimisation below :)

Good point. Done.

> Yeah, no point in premature optimisation. Compared to the overhead of
> compiling the cgi script I doubt anything is going to make that much
> difference to the overall speed of execution...

I actually plan on profiling the whole thing, over time. I don't plan on
anything more than looking for bottlenecks for the first round.

Thanks,
Grant M.
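Since `split` returns a list, the loop quoted above can be collapsed into a single `push`. A minimal sketch, assuming a comma-separated value like the one in the config excerpt; the addresses here are hypothetical stand-ins because the real ones are elided in the archive.

```perl
use strict;
use warnings;

# Hypothetical config value; the real addresses are elided in the message above.
my $blocked_users = 'fred@example.com, charlie@example.org,ox@example.net';

# split returns a list, so the whole result can be pushed in one statement.
# The pattern accepts a comma followed by optional whitespace.
my @bad_users;
push @bad_users, split /,\s*/, $blocked_users;

print scalar(@bad_users), " entries: @bad_users\n";
```

Whitespace before a comma or around the whole value is not removed by this pattern, which is presumably why the later code trims each entry again before comparing.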
From: Simon W. <es...@ou...> - 2003-02-17 16:40:13
On Mon, 2003-02-17 at 16:16, Wizard wrote:

> That's fine, 2 to 1 against. I've replaced the code with:
>
> foreach( split /\,\s*/, $blocked_users ) {
>     push @bad_users, $_;
> }

Why not use a list rather than a scalar, it would be consistent with our
other config directives and save on this step.

> $user =~ s/^\s*(.*)\s*/$1/;
> @fail = ( 0, "ALLOWED" );
> foreach( @bad_users ) {
>     $_ =~ s/^\s*(.*)\s*/$1/;
>     if( (length( $_ ) == length( $user )) && index( $user, $_) != -1 ) {
>         @fail = ( 1, "$_" );

I'd probably throw a 'last' in here. Only one match is needed. Then again,
see note about premature optimisation below :)

>     }
> }
> return @fail;
>
> This will require an exact match between the blocked string and the entered
> email address. It returns an array of:
> ( 1|0, "ALLOWED"|"string_matched" )

Looks OK to me.

> I plan on profiling this against a grep, but I suspect that this is faster.

Yeah, no point in premature optimisation. Compared to the overhead of
compiling the cgi script I doubt anything is going to make that much
difference to the overall speed of execution...

Simon.
From: Wizard <wi...@ne...> - 2003-02-17 16:22:01
That's fine, 2 to 1 against. I've replaced the code with:

    foreach( split /\,\s*/, $blocked_users ) {
        push @bad_users, $_;
    }
    $user =~ s/^\s*(.*)\s*/$1/;
    @fail = ( 0, "ALLOWED" );
    foreach( @bad_users ) {
        $_ =~ s/^\s*(.*)\s*/$1/;
        if( (length( $_ ) == length( $user )) && index( $user, $_) != -1 ) {
            @fail = ( 1, "$_" );
        }
    }
    return @fail;

This will require an exact match between the blocked string and the entered
email address. It returns an array of:

    ( 1|0, "ALLOWED"|"string_matched" )

I plan on profiling this against a grep, but I suspect that this is faster.

Grant M.

> -----Original Message-----
> From: nms...@li...
> [mailto:nms...@li...]On Behalf Of Nick Cleaton
> Sent: Monday, February 17, 2003 2:27 AM
> To: Wizard
> Cc: Simon Wilcox; nms-devel
> Subject: Re: Organisation filtering RE: [Nms-cgi-devel] Anyone...Anyone...?
>
> On Fri, Feb 14, 2003 at 07:12:27AM -0500, Wizard wrote:
> >
> > Please, ANYONE, feel free to say "Grant, you're an idiot!". As long as
> > you don't call my mother names, I'm pretty open to criticism ;-)
> >
>
> Grant, you're committing too much effort on a peripheral feature. Not
> just development effort now, but support and development effort in the
> future as well.
>
> I don't think the feature is worth the complexity it will add to code,
> documentation and testing.
>
> --
> Nick
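Two details of the code above are worth spelling out. When two strings have the same length and one contains the other, they are identical, so the `length`/`index` test behaves like `eq`. And `s/^\s*(.*)\s*/$1/` strips leading whitespace but leaves trailing spaces in place, because the greedy `.*` consumes them before `\s*` gets a chance. The sketch below shows one way to write the same exact-match check with those points addressed; the sub names `trim` and `check_blocked` and the example.com addresses are hypothetical, not taken from the NMSBoard source.

```perl
use strict;
use warnings;

# Hypothetical helper: strip leading and trailing whitespace.
sub trim {
    my ($s) = @_;
    $s =~ s/^\s+//;
    $s =~ s/\s+$//;
    return $s;
}

# Returns ( 1, matched_entry ) when $user exactly equals a blocked entry,
# otherwise ( 0, "ALLOWED" ), mirroring the return convention described above.
sub check_blocked {
    my ( $user, $blocked_users ) = @_;
    $user = trim($user);
    for my $entry ( map { trim($_) } split /,/, $blocked_users ) {
        return ( 1, $entry ) if $entry eq $user;    # stop at the first match
    }
    return ( 0, "ALLOWED" );
}

my @result = check_blocked( ' fred@example.com ', 'joe@example.com, fred@example.com' );
print "@result\n";
```

Returning from inside the loop has the same effect as the `last` suggested in the reply: only one match is needed.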
From: Nick C. <ni...@cl...> - 2003-02-17 09:13:41
On Fri, Feb 14, 2003 at 07:12:27AM -0500, Wizard wrote:
>
> Please, ANYONE, feel free to say "Grant, you're an idiot!". As long as you
> don't call my mother names, I'm pretty open to criticism ;-)
>

Grant, you're committing too much effort on a peripheral feature. Not just
development effort now, but support and development effort in the future as
well.

I don't think the feature is worth the complexity it will add to code,
documentation and testing.

--
Nick
From: Wizard <wi...@ne...> - 2003-02-14 12:17:37
> Which is why I suggested splitting the configuration/processing so that it
> is easier to understand and predict the outcome of the process.
>
> Given the amount of effort going into supporting (Form|TF)Mail I don't
> think we should be creating anything that is likely to add to that burden.

Well, NMSBoard WITHOUT email filtering will likely add to that burden
substantially. In this first very-limited version we'll be adding IP
filtering, templating, optional threading/Guestbook formats, moderation, and
external configurations with optional <<HERE formats. That's enough to make
anyone sweat :)

> Well, I have failed or we wouldn't be having this discussion :)

If you failed, we wouldn't be having ANY discussion :) I'm definitely not
trying to dismiss your ideas or concerns, I'm just not as concerned about the
possible repercussions as you seem to be. It's kind of hard to judge when
there are only two of us involved in the discussion. Please, ANYONE, feel
free to say "Grant, you're an idiot!". As long as you don't call my mother
names, I'm pretty open to criticism ;-)

> Be wary of this approach. If we pull it because users are struggling, the
> project will get a bad reputation from the start both from those that
> couldn't get on with it and those who actually liked it and felt cheated
> when we take it out. Not good.

That's worst-case, but I don't expect it to come to that. I think once we get
a knowledgebase going a substantial amount of the support may go away as
well. It would be nice however, if another person would chime in and call us
names or something, so that we have some point of reference other than our
own (this tunnel-vision tends to make one nauseous after a spell, no? ;-)

> Playing devil's advocate for a moment, can you actually point to an email
> that requests an allow/deny mechanism based on the organisation of the
> domain and independent of the hierarchy, e.g. aol.co.uk and aol.com *in
> one filter* ?

No, I can't cite anything more than a request to block users from posting
(which this system should do without a hitch). Beyond that, the rest is just
my own attempt at fluff. Potentially dangerous fluff, but fluff just the same
:) I'm mostly basing my ideas on problems encountered on my friend's
Coonhound Central WWWBoard, where I had added similar functionality (IP
address & email filters). Come to think of it, I think I'll try to dig up
that code and see what I did there (it was back in '97 or so).

Thanks again,
Grant M.
From: Simon W. <es...@ou...> - 2003-02-14 08:46:36
On Thu, 13 Feb 2003, Wizard wrote:

> > I just think that is going to be nigh on impossible to test and/or
> > document it so that it makes sense to our users, that frankly struggle (in
> > some instances) with *very* *simple* configurations in FormMail and so on.
>
> I agree to some extent (for either of our implementations). What I was
> suggesting was simply a warning that, because of the complexities in email
> address hierarchy, that the system may not always work as expected. I
> suspect that there will be questions such as "x@x.x.x is being filtered and
> they shouldn't be" and we'd have to tell them what to do to fix it.

Which is why I suggested splitting the configuration/processing so that it is
easier to understand and predict the outcome of the process.

Given the amount of effort going into supporting (Form|TF)Mail I don't think
we should be creating anything that is likely to add to that burden.

> > I'm sorry that I've failed to convince you of the dangers I see in this
> > approach and the risk that we will create something that is almost
> > impossible to test and that will produce unpredictable results, at least
> > for the class of users we expect to use it.
>
> You haven't necessarily failed. I think I understand the dangers, but
> believe that they could be less severe than you suggest. The option is to
> not include it at all, which would be all right with me. I just felt that it
> was worth a shot to see what sort of feedback it offers from users. I don't
> believe that we are doing anything that is inherently 'broken', just limited
> in its scope.

Well, I have failed or we wouldn't be having this discussion :)

> > Go for what you think is best and when it's ready we'll test it.
>
> I'll include what I have, and perhaps we'll get some more feedback once
> people start setting it up. Worst case scenario, we yank it back out, or
> offer it as an unsupported add-in.

Be wary of this approach. If we pull it because users are struggling, the
project will get a bad reputation from the start both from those that
couldn't get on with it and those who actually liked it and felt cheated when
we take it out. Not good.

> > Am I barking up the wrong tree entirely ?
>
> No, as far as I can tell, your concerns are valid. I just felt that it
> would be better to attempt to offer something in response to the request,
> rather than offer nothing.

Playing devil's advocate for a moment, can you actually point to an email
that requests an allow/deny mechanism based on the organisation of the domain
and independent of the hierarchy, e.g. aol.co.uk and aol.com *in one filter* ?

Simon.
From: Wizard <wi...@ne...> - 2003-02-14 03:01:49
> I just think that is going to be nigh on impossible to test and/or
> document it so that it makes sense to our users, that frankly struggle (in
> some instances) with *very* *simple* configurations in FormMail and so on.

I agree to some extent (for either of our implementations). What I was
suggesting was simply a warning that, because of the complexities in email
address hierarchy, the system may not always work as expected. I suspect that
there will be questions such as "x@x.x.x is being filtered and they shouldn't
be" and we'd have to tell them what to do to fix it.

> I'm sorry that I've failed to convince you of the dangers I see in this
> approach and the risk that we will create something that is almost
> impossible to test and that will produce unpredictable results, at least
> for the class of users we expect to use it.

You haven't necessarily failed. I think I understand the dangers, but believe
that they could be less severe than you suggest. The option is to not include
it at all, which would be all right with me. I just felt that it was worth a
shot to see what sort of feedback it offers from users. I don't believe that
we are doing anything that is inherently 'broken', just limited in its scope.

> You're the lead on the project and I am short of CFT so I'm reluctant to
> write code that isn't going to be used. I prefer designing first and
> coding second (well, third actually, after tests :) whereas you're jumping
> straight into code which is I think causing some of the confusion.

I actually prefer long naps with dreams of software that KWIWADI (Knows What
I Want And Does It :).

> Go for what you think is best and when it's ready we'll test it.

I'll include what I have, and perhaps we'll get some more feedback once
people start setting it up. Worst case scenario, we yank it back out, or
offer it as an unsupported add-in.

> Am I barking up the wrong tree entirely ?

No, as far as I can tell, your concerns are valid. I just felt that it would
be better to attempt to offer something in response to the request, rather
than offer nothing.

Thanks,
Grant M.
From: Simon W. <es...@ou...> - 2003-02-14 01:02:13
On Thu, 13 Feb 2003, Wizard wrote:

> > grep the domain for a blocked_org
> > if successful, check the tld.
> > if generic check the second level domain
>
> What determines "generic"? A list of gTLDs & ccTLDs? or something else?

Actually I realised that it could be simplified a bit:

  domain contains a blocked_org {
    blocked_org in 2nd level {
      deny
    } else ccTLD and blocked_org in 3rd level and generic 2nd level {
      deny
    }
  }
  accept

This covers ccTLDs without generic 2nd levels, such as .fi and those that do
(.uk), as well as the gTLDs.

To answer your specific question, generic as in gTLD.

> > The caveat would be if there were a legitimate use of, say, .com under a
> > ccTLD that uses .co with a host of aol, e.g. aol.com.uk would be
> > blocked.
>
> That does fail on my tests, but so far it's the only scenario that I have
> found that doesn't do what I expect. Unfortunately, there are actually
> domains like that: br...@ne..., in...@ne..., ac...@ac..., etc.

A further check could be added that would force a second element into an
Organisation role when there are only two elements.

> > I *still think* that it's too open to getting wrong though as it will
> > only work for pan global companies who have registered their name in
> > every ccTLD.
>
> I really don't see that we can ever make this completely fool-proof, given
> what we are dealing with. So far, the *.ac.dk is the only scenario that I
> have to indicate where my test doesn't DWIM (and from what you're saying,
> yours do too). Not to say that it won't in numerous other situations, but I
> haven't seen it yet, so I'm not clear regarding your concern about its
> workings. I understand you feel there is some issue here, and I'm not
> arguing with you, but I guess I'm just not clear on where the issue lies (or
> what the difference is).

I just think that it is going to be nigh on impossible to test and/or
document it so that it makes sense to our users, that frankly struggle (in
some instances) with *very* *simple* configurations in FormMail and so on.

I think that we will end up with too many weird edge cases that will just
confuse people and/or a maintenance nightmare of trying to code around them
when they turn up.

I'm sorry that I've failed to convince you of the dangers I see in this
approach and the risk that we will create something that is almost impossible
to test and that will produce unpredictable results, at least for the class
of users we expect to use it.

> Perhaps, if you could patch together some code when you have a moment, I'll
> be glad to try it out. If you can't, my code is just a subroutine that can
> easily be replaced/removed prior to release (returns a 1|0).
> Let me know,
> Grant M.

You're the lead on the project and I am short of CFT so I'm reluctant to
write code that isn't going to be used. I prefer designing first and coding
second (well, third actually, after tests :) whereas you're jumping straight
into code which is I think causing some of the confusion.

Go for what you think is best and when it's ready we'll test it.

Note to all - can someone pitch in with views on this please. Am I barking up
the wrong tree entirely ?

Simon.
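The simplified deny/accept rule sketched in this message can be written out as a small subroutine. This is an illustrative reading of the pseudocode, not code from the project: the sub name `org_blocked`, the contents of both lookup tables, and the shortcut of treating any two-letter TLD as a ccTLD are all assumptions.

```perl
use strict;
use warnings;

# Hypothetical lists for illustration; a real table of generic second-level
# labels would need to be much longer and maintained per country.
my %BLOCKED_ORG = map { $_ => 1 } qw( aol microsoft foo );
my %GENERIC_2ND = map { $_ => 1 } qw( co ac org com net gov );

# Returns 1 when the address should be denied, 0 otherwise.
sub org_blocked {
    my ($email) = @_;
    my ($domain) = $email =~ /\@([^@\s]+)$/ or return 0;
    my @labels = reverse split /\./, lc $domain;    # TLD first
    my ( $tld, $second, $third ) = @labels;

    # Blocked organisation sits directly under the TLD: aol.com, aol.fi.
    return 1 if defined $second && $BLOCKED_ORG{$second};

    # Assumption: a two-letter TLD is a ccTLD. Look one level deeper only
    # when the second level is a generic label such as "co": aol.co.uk.
    return 1
        if length($tld) == 2
        && defined $second && $GENERIC_2ND{$second}
        && defined $third  && $BLOCKED_ORG{$third};

    return 0;
}

for my $addr ( 'fred@aol.com', 'fred@aol.co.uk', 'fred@aol.bl.uk', 'fred@aol.com.uk' ) {
    printf "%-18s %s\n", $addr, ( org_blocked($addr) ? 'deny' : 'accept' );
}
```

With these tables the sketch accepts an address at aol.bl.uk, because "bl" is not a generic second-level label, and it denies aol.com.uk, which is the caveat raised earlier in the thread.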
From: Wizard <wi...@ne...> - 2003-02-13 22:10:44
Sorry, my power went out and I was using up my UPS, so I had to wait to
reply.

> > 'Organization' - that's the word I was looking for! (note: Yanks use 'z')
>
> That's 'cos you can't speel :)

We don't need to - no one can really understand us anyway :).

<<snip>>

> grep the domain for a blocked_org
> if successful, check the tld.
> if generic check the second level domain

What determines "generic"? A list of gTLDs & ccTLDs? or something else?

> The caveat would be if there were a legitimate use of, say, .com under a
> ccTLD that uses .co with a host of aol, e.g. aol.com.uk would be
> blocked.

That does fail on my tests, but so far it's the only scenario that I have
found that doesn't do what I expect. Unfortunately, there are actually
domains like that: br...@ne..., in...@ne..., ac...@ac..., etc.

> I *still think* that it's too open to getting wrong though as it will
> only work for pan global companies who have registered their name in
> every ccTLD.

I really don't see that we can ever make this completely fool-proof, given
what we are dealing with. So far, the *.ac.dk is the only scenario that I
have to indicate where my test doesn't DWIM (and from what you're saying,
yours do too). Not to say that it won't in numerous other situations, but I
haven't seen it yet, so I'm not clear regarding your concern about its
workings. I understand you feel there is some issue here, and I'm not arguing
with you, but I guess I'm just not clear on where the issue lies (or what the
difference is).

Perhaps, if you could patch together some code when you have a moment, I'll
be glad to try it out. If you can't, my code is just a subroutine that can
easily be replaced/removed prior to release (returns a 1|0).

Let me know,
Grant M.
From: Simon W. <es...@ou...> - 2003-02-13 17:28:52
On Thu, 2003-02-13 at 16:57, Wizard wrote:

> > .co is not a valid tld, either generic or country code. See
> > http://www.din.de/gremien/nas/nabd/iso3166ma/codlstp1/db_en.html
> >
> > .bl is not a tld in this example, uk is. It happens that the British
> > Library for reasons dating back to antiquity, sit outside the generally
> > accepted standard of .(co|ac|org).uk
> >
> > aol.police.uk is really going to mess you up :)
> >
> > I still believe that you cannot parse the ORGANISATION out of the domain
> > part of the email address without more semantic information than you
> > have available.
>
> 'Organization' - that's the word I was looking for! (note: Yanks use 'z')

That's 'cos you can't speel :)

> I never thought that I would be able to filter every possible email without
> at least a few failures. My only real goal was to offer the best possible
> solution to the problem with the least amount of complexity for the user.
>
> As far as '.bl', I'm still not sure what sort of situation we're talking
> about. If it is some sort of generic domain like '.co' (standard or not)
> then we can add it to the $valid_tlds variable. It apparently doesn't matter
> if it conflicts or not, it will still work (.co is actually a ccTLD for
> Colombia).

Oops, so it is. My list was sorted in the wrong order :-/

[snip good stuff]

OK. Here's a suggestion.

Let's have a separate list of blocked organisations that attempts to do what
you suggest. Lots of documentation to caveat the fact that things might not
DWYM for lesser known organisations, i.e. it works on the basis of looking at
2nd and 3rd level domain names and assumes they are all owned by the same
organisation. This will work for, say, aol but not for domains like
simonwilcox.com which is not the same organisation as simonwilcox.org
(unfortunately :)

The list looks something like:

  @blocked_orgs = qw/ aol microsoft foo /;

There is a list of common 2nd level tlds, such as co ac org etc.

For each email address :

  grep the domain for a blocked_org
  if successful, check the tld.
    if generic check the second level domain
      if successful - it's blocked
    else
      if the second level matches a common generic 2nd level
        check the 3rd level
        if successful - it's blocked
  it succeeds

The block_filters as defined would continue to work.

The caveat would be if there were a legitimate use of, say, .com under a
ccTLD that uses .co with a host of aol, e.g. aol.com.uk would be blocked. I
think that *that's* sufficiently unlikely to be worth a go :)

I *still think* that it's too open to getting wrong though as it will only
work for pan global companies who have registered their name in every ccTLD.

I suspect that most users will use the straightforward filters but separating
them out will :

  a) make it clearer what's going on
  b) simplify the algorithms (maintainability++)
  c) be faster (good thing in cgi scripts :)

Thoughts ?

Simon.
From: Wizard <wi...@ne...> - 2003-02-13 17:03:11
> .co is not a valid tld, either generic or country code. See
> http://www.din.de/gremien/nas/nabd/iso3166ma/codlstp1/db_en.html
>
> .bl is not a tld in this example, uk is. It happens that the British
> Library for reasons dating back to antiquity, sit outside the generally
> accepted standard of .(co|ac|org).uk
>
> aol.police.uk is really going to mess you up :)
>
> I still believe that you cannot parse the ORGANISATION out of the domain
> part of the email address without more semantic information than you
> have available.

'Organization' - that's the word I was looking for! (note: Yanks use 'z')

I never thought that I would be able to filter every possible email without
at least a few failures. My only real goal was to offer the best possible
solution to the problem with the least amount of complexity for the user.

As far as '.bl', I'm still not sure what sort of situation we're talking
about. If it is some sort of generic domain like '.co' (standard or not) then
we can add it to the $valid_tlds variable. It apparently doesn't matter if it
conflicts or not, it will still work (.co is actually a ccTLD for Colombia).
However, it appears as though it is the actual organization name
(www.bl.uk), in which case fr...@ao... shouldn't be filtered by *@*.aol.*.
What I am seeing in fr...@ao... is a machine name 'aol' within the bl.uk
organization. On the other hand, aol.co.uk is actually the mail target for
the aol.co.uk organization, where .co is a generic class id (or whatever it
should be called) similar to .org or .com as a gTLD.

> You cannot expect to have *.aol.* block just websites belonging to
> AOL/Time Warner.

I don't, but I do expect it to at least try to determine if it is an
organization name, cname, gTLD/ccTLD or other type of qualifier (like
.co|.ac). When a user enters *@aol.*, I am assuming that their target is the
organization 'aol', not a machine name. I think that the script should assume
that as well. Typically, I would suspect that most filter globs will either
target a particular user, an organization, or perhaps a country. That is what
the mechanism is designed to address. The system doesn't make the machine
name a first choice, but rather its last choice. That is what I would expect,
and I have assumed that of the users of the script. I am not saying that I'm
not mistaken with this assumption. (I am very good at being wrong ;-).

> You might work on something that allows *.aol.*.uk or *.aol.*.* to match
> in the third level domain.

Let me know what you get, but that's the problem I was struggling with. How
do you do that without filtering nearly anything that contains that value?
For instance:

  FILTER: *@news.*.* -> meaning 'fr...@ne...'
  Email:  fr...@ne...

whereas with the present system, *@news.* will assume 'news' to be the
organization name and not the machine name and will pass 'fr...@ne...' but
deny 'fr...@ne...'

I just got your email, I'll take a look.

Grant M.
From: Simon W. <es...@ou...> - 2003-02-13 16:53:37
|
OK, Here's my attempt at a generic pattern matching email filter. It does NOT attempt any cleverness about what organisation the domain represents. I haven't broken it up into module and test code at this stage although you should be able to rip out the test code and stick a package header on it without too much trouble. Fire at will... :) Simon. |
From: Simon W. <es...@ou...> - 2003-02-13 15:59:18
On Thu, 2003-02-13 at 15:26, Wizard wrote:

> > I *think* that fr...@ao... should have been blocked by *@*.aol.* but
> > it doesn't seem to match on all fields.
>
> If '.bl' is a valid tld (like '.co') then I'll add it to the $valid_tlds
> variable and it should parse correctly. I wasn't aware of it. If it
> conflicts with a country code, then I will have a problem.

.co is not a valid tld, either generic or country code. See
http://www.din.de/gremien/nas/nabd/iso3166ma/codlstp1/db_en.html

.bl is not a tld in this example, uk is. It happens that the British Library
for reasons dating back to antiquity, sit outside the generally accepted
standard of .(co|ac|org).uk

aol.police.uk is really going to mess you up :)

I still believe that you cannot parse the ORGANISATION out of the domain part
of the email address without more semantic information than you have
available.

You cannot expect to have *.aol.* block just websites belonging to AOL/Time
Warner.

You might work on something that allows *.aol.*.uk or *.aol.*.* to match in
the third level domain.

In fact, I'm doing this right now. I just need to test it...

Simon.
From: Wizard <wi...@ne...> - 2003-02-13 15:43:32
|
> Actually, not according to the rules. *.aol.* means just that: > Domain equal > to 'aol' with ANY TLD, not just commercial TLDs. Just to clarify, that is gTLDs, not ccTLDs. I did just check, and it doesn't conflict, as there is no .bl ccTLD. Grant M. |
From: Wizard <wi...@ne...> - 2003-02-13 15:36:03
> fr...@ao... matches everything in *@*.aol.* but it should pass.

Actually, not according to the rules. *.aol.* means just that: Domain equal
to 'aol' with ANY TLD, not just commercial TLDs. It will block .org, .edu,
.museum, .name, .co.uk, etc., anything that is in the $valid_tlds variable.
From: Wizard <wi...@ne...> - 2003-02-13 15:31:46
> I really struggled to understand what was going on there. I'm still not
> sure I get it.

I struggled too ;-). It tries to coax the DOMAIN (i.e., aol, sourceforge,
slashdot, theregister, etc.) from the string and parse it into the $dn
variable. That's what allows the 'aol' from 'fred.aol.co.uk' to end up in the
right field when comparing to '*.aol.*'. I didn't say it would be elegant ;-)

> If it goes in, the error message should say which filter rejected the
> email address.

Ok, that's easy enough.

> I *think* that fr...@ao... should have been blocked by *@*.aol.* but
> it doesn't seem to match on all fields.

If '.bl' is a valid tld (like '.co') then I'll add it to the $valid_tlds
variable and it should parse correctly. I wasn't aware of it. If it conflicts
with a country code, then I will have a problem.

Grant M.
From: Simon W. <es...@ou...> - 2003-02-13 14:36:38
Bad form replying to your own posts but...

On Thu, 2003-02-13 at 14:23, Simon Wilcox wrote:
>
> I *think* that fr...@ao... should have been blocked by *@*.aol.* but
> it doesn't seem to match on all fields.

fr...@ao... matches everything in *@*.aol.* but it should pass.

Simon.
From: Simon W. <es...@ou...> - 2003-02-13 14:23:37
On Thu, 2003-02-13 at 13:16, Wizard wrote:

> Did anyone get a chance to break the test script I emailed to this list (see
> my last message)? I'm hoping to get something posted for NMSBoard by
> tomorrow. If I include the filtering, I plan on LOTS of warnings in the docs
> stating that email filtering is 'experimental' and to "use it at your own
> risk of being flamed".
> Let me know if there is any dissent,

I really struggled to understand what was going on there. I'm still not sure
I get it.

If it goes in, the error message should say which filter rejected the email
address.

I *think* that fr...@ao... should have been blocked by *@*.aol.* but it
doesn't seem to match on all fields.

Simon.
From: Wizard <wi...@ne...> - 2003-02-13 13:21:50
|
Did anyone get a chance to break the test script I emailed to this list (see my last message)? I'm hoping to get something posted for NMSBoard by tomorrow. If I include the filtering, I plan on LOTS of warnings in the docs stating that email filtering is 'experimental' and to "use it at your own risk of being flamed". Let me know if there is any dissent, Grant M. |
From: Wizard <wi...@ne...> - 2003-02-12 15:37:05
|
Here's a bit of something (attached) that seems to work for everything that I've tried. It doesn't thoroughly check the CNAMES yet, but that should be academic. Just pass the email that you want to check (or manually enter it in the $user variable) and put your filter strings in the $users string (escaped). Please let me know when/how it doesn't DWIM or if you otherwise manage to break it. Also, if you can see any possible exploits of this code, that'd be good to know as well (the email address will be form submitted CGI data). Grant M. |
From: Wizard <wi...@ne...> - 2003-02-12 13:50:02
> Something like this maybe:
>
> info@dk -> /^info@dk$/
>
> *@dk -> /@dk$/
>
> *@*.kr -> /\bkr$/
>
> fred@*.spammers.com -> /^fred@.*\bspammers.com$/
>
> In this case, you treat the period in the filter syntax as meaning word
> boundary and munge into the regex accordingly.

The first two work, which is what I meant by "post-@ full-string matches",
but with the last two we have unintended consequences:

  #3 will match fred@bi-kr (yes, I know it's not necessarily valid)
  #4 will match fr...@no...

I do realize that there are likely solutions to these issues, as there are
solutions to most of the issues discussed. But that being the case, why
shouldn't there also be solutions to the problems that I am encountering in
my implementation? In fact (although I'm not positive), I believe that the
system that I have now will interpret all of these examples correctly.

I plan on completing a test script this morning. Let me know if I should post
it to the list for testing.

Grant M.
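The unintended matches described above are easy to reproduce. In the sketch below the two regexes are the ones from the quoted mapping, with `@` escaped so that Perl does not try to interpolate an array. `fred@bi-kr` is the address given in the message; the other two addresses are hypothetical, since the address for case #4 is elided in the archive.

```perl
use strict;
use warnings;

# Regexes from the proposed mapping quoted above ("@" escaped for Perl).
my $kr_filter   = qr/\bkr$/;
my $fred_filter = qr/^fred\@.*\bspammers.com$/;

for my $case (
    [ $kr_filter,   'fred@bi-kr' ],                # "-" then "k" is a word boundary
    [ $fred_filter, 'fred@no-spammers.com' ],      # hypothetical: boundary after "-"
    [ $fred_filter, 'fred@mail.spammersXcom' ],    # hypothetical: unescaped "." matches "X"
) {
    my ( $re, $addr ) = @$case;
    printf "%-26s %s\n", $addr, ( $addr =~ $re ? 'blocked' : 'passes' );
}
```

All three addresses are blocked, because `\b` is satisfied by any transition between a word character and a non-word character (a hyphen as much as a dot), and an unescaped `.` matches any character. Anchoring on a literal dot instead, as in `/\.kr$/`, rejects `fred@bi-kr`.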
From: Simon W. <es...@ou...> - 2003-02-12 13:08:27
On Wed, 2003-02-12 at 12:51, Wizard wrote:

> > I still think it is a requirement to block whole sub-domains, for
> > instance *@*.kr or *@*.spammers-r-us.com
>
> I'm not sure that we differ on that, but there could be some problems with
> it either way: how does the second example treat the example given yesterday
> of info@dk? or fr...@sp...? Depending on how it parses, we could
> block neither or both. Is it better to fail conservatively, aggressively, or
> not at all? These are all things I'm trying to consider (my brain hurts).

We don't but your message seemed to say full string matches and didn't cover
these cases.

I think the syntax for specifying these filters needs to be domain specific
and most definitely NOT regexes. It's too easy for users to get these wrong.
I would expect the code to turn the syntax into regexes.

Something like this maybe:

  info@dk              -> /^info@dk$/
  *@dk                 -> /@dk$/
  *@*.kr               -> /\bkr$/
  fred@*.spammers.com  -> /^fred@.*\bspammers.com$/

In this case, you treat the period in the filter syntax as meaning word
boundary and munge into the regex accordingly.

S.
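For comparison, here is a sketch of the "code turns the filter syntax into a regex" step. It deliberately uses a stricter translation than the word-boundary mapping listed above: every character is matched literally via `quotemeta`, and only `*` is expanded, to a run of non-`@` characters. The sub name `filter_to_regex` and the case-insensitive match are assumptions made for illustration, not something taken from the thread's code.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a glob-style filter such as "*@*.kr" into an
# anchored, case-insensitive regex. "*" matches any run of characters other
# than "@"; every other character, including ".", is matched literally.
sub filter_to_regex {
    my ($filter) = @_;
    my $pattern = quotemeta $filter;     # "*" becomes "\*", "." becomes "\."
    $pattern =~ s/\\\*/[^\@]*/g;         # expand each escaped "*" wildcard
    return qr/\A$pattern\z/i;
}

for my $filter ( 'info@dk', '*@dk', '*@*.kr', 'fred@*.spammers.com' ) {
    print "$filter -> ", filter_to_regex($filter), "\n";
}
```

Under this translation `*@*.kr` requires a literal dot before `kr`, so an address such as `fred@bi-kr` is not caught, and users never write regex syntax themselves. It does nothing about the harder question of which label in a domain is the organisation.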