From: Fabien M. <fab...@by...> - 2013-03-08 14:50:41
|
Dear all,

I'm a new user of fail2ban and I have a problem with it. I enabled the [apache-badbots] section in jail.conf.

In my Apache logs, I have this:

66.249.73.131 - - [08/Mar/2013:15:12:43 +0100] "GET /some/page/of/website HTTP/1.1" 200 13256 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

Here is my fail2ban configuration:

/etc/fail2ban/jail.conf

[apache-badbots]
enabled = true
filter = apache-badbots
action = iptables-multiport[name=BadBots, port="http,https"]
         sendmail-buffered[name=BadBots, lines=5, dest=yo...@ex...]
logpath = /var/log/httpd/website_access.log
bantime = 172800
maxretry = 5

/etc/fail2ban/filter.d/apache-badbots.conf

badbots = Mozilla/5\.0 \(compatible; Googlebot/2\.1; +http\://www\.google\.com/bot\.html\)
failregex = ^<HOST> -.*"(GET|POST).*HTTP.*"(?:%(badbots)s|%(badbotscustom)s)"$

When I test this regex with the following command, I get no match, and I don't know why it doesn't find anything:

fail2ban-regex /var/log/httpd/website_access.log /etc/fail2ban/filter.d/apache-badbots.conf

Results
=======

Failregex
|- Regular expressions:
|  [1] ^<HOST> -.*"(GET|POST).*HTTP.*"(?:User-Agent\: Mozilla/5\.0 \(compatible; MJ12bot/v1\.4\.3; http\://www\.majestic12\.co\.uk/bot\.php\?+\)|EmailCollector|WebEMailExtrac|TrackBack/1\.02|sogou music spider)"$
|
`- Number of matches:
   [1] 0 match(es)

Can somebody help me solve this? Thanks in advance!

Fabien

--
MORCAMP Fabien
Systems engineer
Mail: fab...@by...
2, rue des nonettes
77000 Melun
|
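A note for readers testing a filter like the one above: one likely reason for the missing match is the unescaped "+" in the badbots value. In a regular expression " +http" means "one or more spaces, then http", so it cannot match the literal "+http" in the log line. The sketch below reproduces this with Python's re module. It is only an approximation of what fail2ban does: the HOST pattern is a simplified IPv4-only stand-in for the <HOST> tag, and the badbotscustom alternative is left out. Note also that the fail2ban-regex output quoted above lists an MJ12bot pattern rather than the Googlebot one, so the edited badbots value may not be the one that was actually loaded.

```python
import re

# Log line copied from the original message.
LOG_LINE = (
    '66.249.73.131 - - [08/Mar/2013:15:12:43 +0100] '
    '"GET /some/page/of/website HTTP/1.1" 200 13256 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
)

# Assumption: simplified IPv4-only stand-in for fail2ban's <HOST> tag.
HOST = r"(?P<host>\d{1,3}(?:\.\d{1,3}){3})"

# failregex from the message, with the badbotscustom alternative omitted.
TEMPLATE = r'^<HOST> -.*"(GET|POST).*HTTP.*"(?:%(badbots)s)"$'

# badbots value exactly as posted: the "+" is a quantifier on the space.
AS_POSTED = r"Mozilla/5\.0 \(compatible; Googlebot/2\.1; +http\://www\.google\.com/bot\.html\)"

# Same value with the "+" escaped so it matches a literal plus sign.
ESCAPED = r"Mozilla/5\.0 \(compatible; Googlebot/2\.1; \+http\://www\.google\.com/bot\.html\)"


def build(badbots):
    """Interpolate badbots into the template and substitute the host pattern."""
    return re.compile((TEMPLATE % {"badbots": badbots}).replace("<HOST>", HOST))


posted_match = build(AS_POSTED).search(LOG_LINE)
escaped_match = build(ESCAPED).search(LOG_LINE)

print("as posted:", bool(posted_match))
print("escaped:  ", bool(escaped_match))
```

If the escaped variant matches in a test like this, the same change ("\+" instead of "+") is worth trying in the filter file before re-running fail2ban-regex.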
From: <al...@ma...> - 2013-03-08 16:05:13
|
Fabien,

Have you considered using robots.txt to stop Google from crawling your site? It's a pretty easy thing to do, just read http://robotstxt.org

I can see that in your regex you are also excluding our bot (MJ12bot). Why are you doing this when robots.txt can be used to control both us and Googlebot? Have you contacted us directly with any problems related to our bot? I don't seem to remember seeing any messages from you, even though we've got a dedicated bot page with a contact email right at the top of it.

And by the way, did you ask permission of your customers hosting at bysoft-hosting.com to implement such search engine blocks behind their backs? This can cause your customers' sites to be delisted from Google if you block it from accessing those hosted sites. Is your company management aware of what you are doing to their own paying customer base?

These are pretty serious questions that you need to consider.

Best regards,

Alex Chudnovsky
Managing Director
Majestic-12 Ltd (t/a Majestic-SEO)
Faraday Wharf, Holt Street
Birmingham Science Park, Aston
Birmingham, B7 4BB
United Kingdom
|
From: Fabien M. <fab...@by...> - 2013-03-08 16:23:51
|
Hi Alex,

Thanks for your reply. But I installed fail2ban to enjoy its powerful management. I know I can use robots.txt, but I want to use fail2ban. If this module was created, it is meant to be used!

I didn't contact you about your bot because this was just a test. When I looked at my logs, I saw your bot, so I tried the regex on it, as it could have been any other!

Behind their backs?? I didn't do anything behind their backs! I am trying to solve a problem for a client who asked me about this. So I'm doing it. I know what I'm doing and what I can do.

But I'm here to find a solution. This is an implemented module and I would like to know how to use it correctly. Maybe it comes from my regex, with \ characters that are needed or not...

I hope somebody can help me with this.

Fabien

--
MORCAMP Fabien
Systems engineer
Mail: fab...@by...
2, rue des nonettes
77000 Melun
|
From: <al...@ma...> - 2013-03-08 16:58:45
|
Dear Fabien,

> I know I can use robots.txt but I want to use fail2ban.

Why would you call legitimate crawlers that support the robots.txt standard "bad bots" and block them in a way that gives the bot writers no information that the sites should not be crawled?

Bad bots are normally considered to be those that do not follow the robots.txt standard, and especially those that fake user-agents to pretend to be a browser or some other crawler. To the best of our knowledge Google does not do that, and neither do we.

> Behind their backs?? I didn't do anything behind their back! I try to solve a client which
> ask me about that. So I do it. I know what I do and what can I do.

I don't know who your client is, but if your client is the hosting company that wants to implement such a block for sites hosted on behalf of their customers without informing those customers, then that is exactly "doing it behind their backs".

We come across such BAD behaviour from time to time, where a hosting provider did this to their customers without their knowledge. We know that for a fact because their customers tried to verify their site on our system to get free data and could not do so because of blocks put in place behind their backs.

Blocking a search engine bot will leave it unable to fetch content, and that can cause search engine rankings to go down. This can bankrupt small businesses who would not have a clue as to why it happened - they'll just see a drop in traffic from Google, and that assumes they have an analytics system to show it, which for many small companies is far too advanced a thing to use.

This is particularly true if the block is done at the FIREWALL level, which makes the site look down to the search engine. Common practice in such cases is to stop showing the site in search results because it can't be visited.

Tell your client about these risks. Tell your client to use robots.txt to control good bots that obey it - this includes Googlebot and MJ12bot. Think about it before "enjoying powerful management" of a tool that can be extremely dangerous to people's livelihoods.

> Hope somebody can help me on.

I've given you extremely useful advice that you would not otherwise get from this newsgroup. I am sure you'll get regular expression advice that would help bankrupt a few companies; that I won't help you with, because I want to sleep well at night with a clear conscience :)

Cheers,

Alex
|
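As a rough illustration of the robots.txt approach discussed above, the sketch below uses Python's standard urllib.robotparser to show how per-crawler rules are evaluated. The rules and paths are made up for the example (only the user-agent tokens Googlebot and MJ12bot come from the thread), and this only shows how a well-behaved crawler would interpret the file; it does nothing about bots that ignore robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep MJ12bot out entirely, keep Googlebot out of
# /private/ only, and allow every other crawler everywhere.
ROBOTS_TXT = """\
User-agent: MJ12bot
Disallow: /

User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow:
"""


def load_rules(text):
    """Parse robots.txt content from a string instead of fetching a URL."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

for agent, path in [
    ("MJ12bot", "/some/page/of/website"),
    ("Googlebot", "/some/page/of/website"),
    ("Googlebot", "/private/report.html"),
    ("SomeOtherBot", "/private/report.html"),
]:
    print(agent, path, rules.can_fetch(agent, path))
```

Unlike a firewall ban, this leaves the site reachable for the crawler, which can then see and honour the rule.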
From: Tom H. <to...@wh...> - 2013-03-09 15:04:22
|
On 08-03-13 17:58, al...@ma... wrote:
> Dear Fabien,
>
>> I know I can use robots.txt but I want to use fail2ban.
>
> Why would you call legitimate crawlers that support robots.txt
> standard as "bad bots" and block them in a way that does not give
> information to the bot writers that the sites should not be crawled?

Alex, with all due respect: this is a technical mailing list. I'd prefer to see the politics go somewhere else, especially when they come in the aggressive form in which you addressed Fabien's question. A simple pointer towards robotstxt.org would have been more than enough to state your concerns about his intentions.

If Fabien is a decent sysadmin, he knows what he's doing (or at least he should), and he is doing whatever is best for his cause/business/clients. If he decides that he doesn't like your bot for some reason, he's free to block access in any way he likes: 'my server, my rules' applies.

TBH: the tone you used, especially in your first reply, didn't invite me at all to contact your company if *I* had issues with your bot.

Just my 2 cents.

Kind regards,
Tom
|
From: <al...@ma...> - 2013-03-09 15:40:45
|
Hi Tom,

> Alex, with all due respect: this is a technical mailing list.

It's the fail2ban mailing list. This software runs in the real world and causes all sorts of issues that are not technical in nature.

I only noticed the mention of our own bot at the bottom of Fabien's email; the reason I responded in the first place was that I saw Googlebot was about to get blocked. I've explained the likely consequences of such an action, but clearly it has fallen on deaf ears.

> TBH: the tone you used, especially in your first reply,
> didn't invite me at all in contacting your company if *I* would have issues with your bot.

I am sorry that you feel this way, but the right course of action is always to attempt to contact the people who can deal with the problem. We don't hide who we are and we respond very quickly to any concerns regarding our bot. I don't believe Fabien made any attempt to contact us - I personally receive every single email related to our bot.

You don't like my tone, but here is what I don't like - we regularly have to deal with fail2ban users who set it up in such a way that it sends automated ABUSE complaints to IP block owners. These are totally unjustified, because they tend to consist of a few 404 Not Found errors, often on a missing robots.txt! There is typically no indication of which domain name is affected (so we can't stop crawling it) and no information about the person who sent the email.

Yet we have to respond to every single abuse email like this: there is a human cost on our end, and on the other hand zero cost to send this automated spam driven by fail2ban. I've raised this issue on here before but was told that it's a user configuration thing, so we have to waste our resources dealing with totally false complaints that are ultimately fuelled by fail2ban software.

That's been going on for years now; how do you think I should feel about it? Blocking bots on your own servers is one thing, but making false abuse accusations is totally different, yet nobody on here seems to recognise it, and I have not seen any effort made to change the fail2ban software to avoid such side effects. :(

Alex
|
From: Fabian W. <fa...@we...> - 2013-03-10 18:11:53
|
Hello Alex

On 09.03.2013 16:40, al...@ma... wrote:
> That's been going on for years now, how do you think I should feel about it?
> Blocking bots on your own servers is one thing, but making false abuse
> accusations is totally different, yet nobody seems to recognise it on here
> and I did not see any effort made to change fail2ban software to avoid such
> side effects. :(

Why does this come up again? As you have already been told, the default install of fail2ban does not send out abuse notices. Also, no jail is enabled by default. The filters included in fail2ban which do match on error 404 do so only in quite limited special cases. See the thread "Trying to contact maintainers/developers of fail2ban" [1] in the mailing list archive for details.

[1] http://sourceforge.net/mailarchive/message.php?msg_id=29671920

If a user (aka SysAdmin) enables the stuff which is already there in fail2ban, it will not create abuse notices because of a missing robots.txt. You cannot blame the software developers for mistakes clearly made by the end user, who adjusts the configuration in such a way that it matches on all 404 errors and also sends out abuse notices. For this to work, the user needs to create their own filter with their own regex.

I do understand the trouble you have with these abuse notices. As was discussed back in August 2012, it seems that you probably get them from a single user / organisation, as the subject line of those e-mails could not be found anywhere else. This clearly indicates that it is custom made.

Eventually you need to rethink the business model for your search engine and e.g. do the crawling from your own IP networks, which are registered to your company. Then you will get such abuse notices directly and will be able to deal with them better. As long as you depend on volunteers running your crawler from their home internet connections, the abuse notices will go to their ISP, which will then forward them to its end user, probably without disclosing the original sender.

bye
Fabian
|
From: <al...@ma...> - 2013-03-10 19:14:38
|
Hi Fabian,

> Why does this come up again?

It comes up again because nothing was done by fail2ban to stop it - we continue to get such false abuse complaints regularly. Here is a snippet of "log evidence" from one of the most recent:

---------------------
[Sat Feb 16 00:50:55 2013] [error] [client 46.165.197.151] File does not exist: /var/www/html/robots.txt
[Sat Feb 16 00:50:59 2013] [error] [client 46.165.197.151] File does not exist: /var/www/html/about
[Sat Feb 16 00:51:01 2013] [error] [client 46.165.197.151] File does not exist: /var/www/html/contact
---------------------

Notice the first line? Robots.txt wasn't even present on the site.

Almost all "abuse" complaints that we get are generated by fail2ban and basically consist of a few HTTP 404 Not Found errors. There was no contact information for the person who generated that email and no information about the IP/domain that was accessed, so we can't even stop crawling it. The hosting provider can't even get back to them, so those people keep running fail2ban, which sends out this fail2ban-driven spam that has to be dealt with by abuse departments and by us.

> You can not blame the software developers for mistakes clearly done by the end user

I don't think you are acting responsibly here, because you just shift the blame to end users and basically wash your hands, even though it's your software (assuming you are the person who maintains it) that is being misused.

Compare your attitude to ours - we have written a crawler that is run by volunteers, BUT we deal with all queries that come up, including abuse complaints when our project members get them. We don't wash our hands saying it's the problem of the end user who is running OUR software. Now THAT's responsible behaviour from a software writer; yours, in my view, isn't.

Have you made any effort at all to stop misuse of fail2ban software? For example, a big red warning on the downloads page on how NOT to use it could have helped. It would certainly be a good start.

If you want, I can ask our company lawyer to prepare a legal view on the legality of emails that make false accusations of criminal behaviour (i.e. hacking); then you can give it to users of fail2ban so that they are fully aware of all the legal risks associated with certain custom configurations. I think your users need to appreciate them.

> Eventually you need to rethink your business model for your search engine
> and e.g. doing the crawling out of your own IP networks, which are registered to your company.

We have a legitimate business model; there is no reason for us to change it. In any case we'd still have to deal with every abuse email, because we are responsible people and won't ignore them just because we own the IPs.

"File does not exist" errors are entirely normal for public web sites and should NOT be relied upon in the first place. This approach could have been justified if only specific, well-defined probes were detected, but the reality is that we get emails saying we were "hacking" a server because login.php on some forum was requested... well, it's linked to from every bloody page on the forum, so of course we try to crawl it at some point!

So maybe you need to rethink your "business model" of scanning error logs? :)

Jokes aside, I think you are not treating this problem as seriously as we do.

Best regards,

Alex Chudnovsky
Managing Director
Majestic-12 Ltd (t/a Majestic-SEO)
Faraday Wharf, Holt Street
Birmingham Science Park, Aston
Birmingham, B7 4BB
United Kingdom
|
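To make the distinction above concrete (counting every "File does not exist" line versus counting only specific, well-defined probes), here is a small sketch in Python. It is not fail2ban code. The document root is taken from the log snippet in the message, while the PROBE_PREFIXES list is purely hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# The three error-log lines quoted in the message above.
LOG_LINES = [
    "[Sat Feb 16 00:50:55 2013] [error] [client 46.165.197.151] File does not exist: /var/www/html/robots.txt",
    "[Sat Feb 16 00:50:59 2013] [error] [client 46.165.197.151] File does not exist: /var/www/html/about",
    "[Sat Feb 16 00:51:01 2013] [error] [client 46.165.197.151] File does not exist: /var/www/html/contact",
]

ERROR_RE = re.compile(
    r"^\[[^\]]+\] \[error\] \[client (?P<ip>[^\]]+)\] "
    r"File does not exist: (?P<path>\S+)$"
)

DOCROOT = "/var/www/html"  # document root as it appears in the quoted log

# Hypothetical list of paths treated as deliberate probes (lower-case).
PROBE_PREFIXES = ("/phpmyadmin", "/w00tw00t")


def count_probes(lines):
    """Count, per client IP, only the missing files that look like known probes."""
    hits = Counter()
    for line in lines:
        match = ERROR_RE.match(line.strip())
        if not match:
            continue
        path = match.group("path")
        if path.startswith(DOCROOT):
            path = path[len(DOCROOT):]
        if path.lower().startswith(PROBE_PREFIXES):
            hits[match.group("ip")] += 1
    return hits


print(dict(count_probes(LOG_LINES)))
```

Under these assumptions the quoted lines produce no hits at all, whereas a rule that counts every missing file would have recorded three failures for the same client.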
From: Tom H. <to...@wh...> - 2013-03-10 20:35:14
|
On 10/03/13 20:14, al...@ma... wrote:
> Hi Fabian,
>
>> Why does this come up again?
>
> It comes up again because nothing was done by fail2ban to stop it - we
> continue to get such false abuse complaints regularly, here is snippet of
> "log evidence" from one of the most recent:
>

All of this has already been discussed, including the responsibility of the fail2ban developers, end users, ISPs relaying complaints, and your company. I don't even know by heart what the final statements were, but I'm pretty sure that this horse was beaten to death a long time ago. So please stop it.

fail2ban works really simply:
- you can write *any* regex
- apply that regex to *any* log file
- execute *any* command based on that

Sensible and useful defaults are added by the developers, and none of them do any harm to your business. That will never stop users from adding other configurations.

Maybe your legal department could write a standard reply template for f2b-generated complaints, targeted at both the end user and the relaying ISP, and use that to aid your volunteers. You're free to add pointers to all of the arguments you have provided and to the f2b community agreeing (or not) with them, available in the mailing list archives.

You're also free to contribute a short (5 lines?) text explaining your concerns about apache access blocking on the wiki [1], but you have to accept the simple fact that f2b cannot block users in any way from configuring the software to do something you don't like.

[1] http://www.fail2ban.org/wiki/index.php/Apache

Kind regards,
Tom
|
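The three-step model described above (any regex, any log file, any command) can be sketched as a toy loop in Python. This is not how fail2ban is implemented; the function name scan, its parameters and the sample log lines are invented for the illustration, and real fail2ban features such as findtime and unbanning are left out.

```python
import re
from collections import Counter


def scan(lines, failregex, maxretry, action):
    """Run action(host) once for every host that reaches maxretry matching lines.

    failregex must contain a named group "host"; action can be any callable,
    standing in for "any command".
    """
    pattern = re.compile(failregex)
    counts = Counter()
    banned = []
    for line in lines:
        match = pattern.search(line)
        if not match:
            continue
        host = match.group("host")
        counts[host] += 1
        if counts[host] == maxretry:
            action(host)
            banned.append(host)
    return banned


# Invented sample data: three failures from one host, one from another.
sample_log = ["10.0.0.1 bad"] * 3 + ["10.0.0.2 bad"]
triggered = []
result = scan(sample_log, r"^(?P<host>\S+) bad$", 3, triggered.append)
print(result)
```

Because the regex and the action are both supplied by whoever configures the tool, nothing in a loop like this can tell a sensible rule from a harmful one, which is the point being made in the message.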
From: <al...@ma...> - 2013-03-10 21:25:26
|
Hi Tom,

> Maybe your legal department could write a standard reply template for f2b
> generated complaints targeted at the both the end user and the relaying ISP,

We did that last year. It helped but did not solve the root problem - we still have to waste our time dealing with every abuse complaint. A more acute problem is that the ISPs of our volunteers often don't show the original abuse complaint, so there is no way to provide a well-argued response pointing out that a few 404 Not Founds are hardly "abuse" or "hacking".

> you have to accept the simple fact that f2b cannot block users in any
> way from configuring the software to do something you don't like.

It's not about my likes or dislikes. We merely want to stop, or at least greatly reduce, the number of instances in which we get falsely accused of criminal behaviour, without having to take legal action against everybody who does that.

It's not too much to ask for, is it? Imagine somebody regularly called the police (anonymously), gave them your name and address saying that you are a thief, and attached some photo of you walking down the street as evidence. It's not nice, especially when it's anonymous.

So I just want to deal with that damned false abuse email spam; you can keep blocking bots any way you want, even Googlebot if you really want to :)

> You're also free to contribute a short (5 lines?) text explaining your concerns
> on apache access blocking on the wiki [1], but

I'll get back to you with this text soon. It might be more than 5 lines, but hey, we can just create a new Wiki page dedicated to the legal risks of defamation when using fail2ban software and link to it from the relevant pages.

In the short term it would be acceptable to us to know that your users had a clear warning that misuse of the software can result in legal liability that is THEIRS, not fail2ban's (I know you already disclaim everything, but I am not sure many users fully understand that this makes THEM liable).

Best regards,

Alex Chudnovsky
Managing Director
Majestic-12 Ltd (t/a Majestic-SEO)
Faraday Wharf, Holt Street
Birmingham Science Park, Aston
Birmingham, B7 4BB
United Kingdom
|
From: Tom H. <to...@wh...> - 2013-03-10 23:32:46
|
On 10/03/13 22:25, al...@ma... wrote:
> Hi Tom,
>
>> Maybe your legal department could write a standard reply template for f2b
>> generated complaints targeted at the both the end user and the relaying ISP,
>
> We've done that last year, it helped but not solved the root problem - we
> still have to waste our time dealing with every abuse complaint, more acute
> problem is that ISPs of our volunteers often don't show original abuse
> complaint so there is no way to provide well-argued response that would
> point out that a few 404 Not Founds is hardly "abuse" or "hacking".
>

You really can't expect an ISP to break their customers' privacy for you: their customers are valuable to them, you aren't. You should simply try to help them educate their customer, or simply ignore complaints from that specific ISP, due to "Abuse department stupidity". If they keep sending useless complaints, they aren't making any effort to help you or their customer, so nothing good will come of any work you do. I understand you have a reputation to manage, but you can only do so much.

Maybe if you ignore the ISP long enough, the end user will become fed up with your recurring 'attacks', and he'll address you directly through the URL in the user-agent.

Final note on this subject: I manned an abuse department for some time until last year. Abuse handling sucks :) They probably hate the nitwit customer just as much as you do; they just can't be bothered to fix your problem for you. So if no action is taken and complaints keep coming, then either they never read your response (tl;dr, no time) or the customer wouldn't take your suggestions for an answer. Relaying complaints probably makes the customer feel heard, so that's all they do.

>> you have to accept the simple fact that f2b cannot block users in any
>> way from configuring the software to do something you don't like.
>
> It's not about my likes or dislikes, we merely want to stop or at least
> greatly reduce number of instances when we get falsely accused of criminal
> behaviour without having to take legal action against everybody who does that.
>
> It's not too much to ask for, isn't it? Imagine somebody regularly called
> the police (anonymously), gave them your name and address saying that you
> are a thief, and attached some photo of you walking down the street as
> evidence. It's not nice, especially when it's anonymous.

Yes, and now you're complaining to the phone company that they enable anonymous persons to file false complaints with the police department :) Seriously, you're barking up the wrong tree here.

>> You're also free to contribute a short (5 lines?) text explaining your concerns
>> on apache access blocking on the wiki [1], but
>
> I'll get back to you with this text soon. It might be more than 5 lines,
> but hey we can just create a new Wiki page dedicated to legal risks of
> defamation when using fail2ban software and link to it from relevant pages.

The reason I mentioned the 5 lines was exactly that you'd want a short and clear suggestion on the relevant page(s). If you want to write a complete legal book on a separate wiki page, that's fine with me, but no one will bother to read it then. tl;dr et cetera.

You can contribute all kinds of legal disclaimers to open source software, but it won't help. Any sysadmin can use iptables to block IP space; do you send a similar request to LKML too?

--
Tom
|
From: <al...@ma...> - 2013-03-11 01:38:47
|
> You really can't expect an ISP to break their customers' privacy for you: > their customers are valuable to them, you aren't. I think you misunderstood the situation: the ISP of our project member, who is their customer, sends them (the project member) an abuse complaint. When the project member asks for the original abuse complaint, they often don't get it from the ISP, even though the project member is a customer of that ISP. > Maybe if you ignore the ISP long enough, the end user will become fed up > with your recurring 'attacks', and he'll address you directly through > the URL in the user-agent. When we get an abuse message we are often given 24 hours to respond, otherwise our crawling servers can be turned off. Our project members would be threatened with losing their internet access; in some cases there is only one provider in the area, so it's a serious threat. > They probably hate the nitwit customer just as much as you do > they just can't be bothered to fix your problem for you For sure, but ISPs have a duty to pass an abuse complaint on to their customer (i.e. us or a project member), otherwise they'd be accepting responsibility for the possible actions of that customer. > Yes, and now you're complaining to the phone company that they enable > anonymous persons to file false complaints at the police department :) > Seriously, you're barking up the wrong tree here. Do you understand that if police time were wasted like this, they'd use their powers to find out who did it even if the phone number was "withheld"? The point is that people should not accuse others of criminal behaviour unless they have evidence to prove it - a few log lines showing 404 Not Founds are nothing. > You can contribute all kinds of legal disclaimers to open source > software, but it won't help. Any sysadmin can use iptables to block ip > space, do you send a similar request to LKML too? 
They can block whatever they want, but if they send false abuse complaints based on nothing then that should stop, because it wastes our time and money. A few times we had abuse complaints which basically showed a firewall log of TCP/IP connections on port 80 :( Alex |
From: Fabian W. <fa...@we...> - 2013-03-10 21:25:50
|
Hello Alex I just want to clarify some points, but I will not continue this discussion any more. The questions below are only rhetorical and do not need to be answered. On 10.03.2013 20:14, al...@ma... wrote: >> Why does this come up again? > > It comes up again because nothing was done by fail2ban to stop it - we How does the manufacturer of your car stop you from driving too fast and then getting a speeding ticket? He does nothing; he sells you a car which is clearly capable of driving a lot faster than the maximum of 120 km/h which is allowed (e.g. here in Switzerland). It is the driver's responsibility to drive at or below the allowed speed for the road being driven on. >> You can not blame the software developers for mistakes clearly done by the > end user > > I don't think you are acting responsibly here because you just shift the > blame to end users and basically wash your hands, even though it's your > (assuming you are the person who maintains it) software that is being > misused. I am just a happy and responsible user of fail2ban, I am not a developer. > Compare your attitude to ours - we have written a crawler that is run by > volunteers BUT we deal with all queries that come up, including abuse > complaints when our project members get them. We don't wash our hands saying > - it's the end user's problem who is running OUR software. Now THAT's a > responsible behaviour of a software writer, yours in my view isn't. Even though I am not a developer: it is always the user's (sysadmin's) responsibility how a piece of software is used. The developer does not force the user to use his software, just as your car manufacturer does not force you to drive the car at full speed, even if it can be done. > Have you made any effort at all to stop misuse of fail2ban software? How should the developer do this? Everybody can install and use the software without the developer's knowledge. 
And as already stated many times, the default install does not do anything on error 404 and it also does not send out abuse notices. > For example a big red font warning on how NOT to use it on the downloads > page could have helped. It would certainly be a good start. I guess it is asking too much, as there are too many ways to configure fail2ban (or any other software) such that it will hurt a third party. This clearly is the responsibility of the user (sysadmin) configuring the software. He knows his system best, and he has to decide how the software can fit his needs. If he is sending out abuse notices, it is his own responsibility. So next time you get such a complaint, write an answer to the ISP and ask them to forward it to the original sender. Even if it is through a third party (the complaining ISP), you have a much better chance of contacting (or informing) this user of fail2ban. > Jokes aside, I think you are not treating this problem as seriously as we > do. I am, as I do not use the abuse notifications and I also do not block access based on random error 404 in the Apache logs. bye Fabian |
From: <al...@ma...> - 2013-03-10 22:15:31
|
Hi Fabian, > How does the manufacturer of your car stop you from driving too fast and then getting a speeding ticket? A car manufacturer will be liable if they don't provide an accurate speedometer. Since car manufacturers do that, it means that the user is willfully breaking the law when speeding, hence liability is with the user. If a car manufacturer sold an inherently flawed product (like in my view fail2ban is when it uses 404 Not Found from the error log) then in all probability it would be shut down very quickly. > I am just a happy and responsible user of fail2ban, I am not a developer. In this case my words that were directed to developers/maintainers do not apply to you personally. Alex Best regards, Alex Chudnovsky Managing Director Majestic-12 Ltd (t/a Majestic-SEO) Faraday Wharf, Holt Street Birmingham Science Park, Aston Birmingham, B7 4BB United Kingdom |
From: Fabian W. <fa...@we...> - 2013-03-10 23:35:51
|
Hello Alex On 10.03.2013 23:15, al...@ma... wrote: > If car manufacturer sold inherently flawed product (like in my > view fail2ban is when it uses 404 Not Found from error log) > then in all probability it will be shutdown very quickly. For the last time, it does not happen even if all the filters delivered with fail2ban are activated. The user / sysadmin of fail2ban needs to create his own filter with his own regex to match on the error 404 you are mentioning. This is like tuning the car so that it drives faster than the manufacturer built it to. bye Fabian |
From: Ben J. <be...@in...> - 2013-03-10 23:39:31
|
Pardon me for stepping-in, but the problem seems clear: some distribution is including the accusatory email notification "off-the-shelf". This action is causing Alex *and* the fail2ban developers frustration. As Tom H. noted, this issue has been beaten to death on this mailing list, and everybody (from both sides of the argument) is tired of hearing about it. I agree with both sides; Alex's concerns are entirely valid, but so are the responses. To continue with the car analogy, configuring fail2ban to send a threatening/potentially-defamatory message out-of-the-box is akin to a car dealership (*not* the car manufacturer) configuring a car to accelerate to maximum speed as soon as the engine is started. The car is doing something that should be possible, technically speaking and at the operator's discretion, but at the same time the car is doing something that is entirely inadvisable and reckless. So, Alex, it seems that you are "barking up the wrong tree", so to speak. If I were you, I would make every effort to identify which distributions and package maintainers are authoring and including these default configurations that are causing you so much heartache. Then, I would file bug reports or whatever is necessary to force them to consider your argument, which, again, I find to be entirely valid. I'm not in a position to look back through the list archives at the moment; did we ever establish which GNU/Linux distribution was including this "problem configuration"? -Ben |
From: Fabian W. <fa...@we...> - 2013-03-11 00:07:48
|
Hello Ben On 11.03.2013 00:19, Ben Johnson wrote: > Pardon me for stepping-in, but the problem seems clear: some > distribution is including the accusatory email notification > "off-the-shelf". This action is causing Alex *and* the fail2ban [...] > I'm not in a position to look back through the list archives at the > moment; did we ever establish which GNU/Linux distribution was including > this "problem configuration"? Alex mentioned a specific subject line of the abuse notices he got last August. This line could not be found anywhere: not in fail2ban, and not via Google either. So it seems that it is a custom line, which probably does not belong to any open source OS distribution or package. bye Fabian |
From: <al...@ma...> - 2013-03-11 01:51:16
|
Hi Ben, Thanks for your input. > Pardon me for stepping-in, but the problem seems clear: some distribution is > including the accusatory email notification "off-the-shelf". This is plausible; my feeling is that such functionality was turned on by default rather than enabled by the end user. If anybody is aware of any derived products based on fail2ban then please share. > If I were you, I would make every effort to identify which distributions > and package maintainers are authoring and including these default configurations > that are causing you so much heartache. Well, I guess we'd have to do a lot of research and testing to find out who might be shipping fail2ban in a different package with abuse emails on by default. I've tried asking some of the people who made abuse complaints what they use, but did not get much further than that they use fail2ban; the vast majority of them send abuse emails without information about their site or a return email address. The only way to get to them would be to take legal action and subpoena their ISP; this would cost a lot of money and we'd just end up with some poor chap who used the default config of some free download. That's why I am very frustrated: the only lead I have is fail2ban, but it feels like a dead end. Best regards, Alex Chudnovsky Managing Director Majestic-12 Ltd (t/a Majestic-SEO) Faraday Wharf, Holt Street Birmingham Science Park, Aston Birmingham, B7 4BB United Kingdom |
From: Fabian W. <fa...@we...> - 2013-03-10 18:11:51
|
Hello Fabien On 08.03.2013 15:50, Fabien MORCAMP wrote: > I'm new user of fail2ban and I've a problem with it. > I active in jail.conf the [apache-badbots] section. > > In my Apache logs, I've this: > > 66.249.73.131 - - [08/Mar/2013:15:12:43 +0100] "GET /some/page/of/website HTTP/1.1" 200 13256 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" The filter you are using is named "badbots", so why do you want to use it on Googlebot? I suppose the default regex of the apache-badbots filter contains crawlers which do not follow the robots.txt file, and for those one could consider using fail2ban to block them. As others have already stated, I also recommend using robots.txt to block Googlebot from accessing certain web sites. See the help at [1] and [2]. [1] http://support.google.com/webmasters/bin/answer.py?hl=en&answer=156449 [2] http://support.google.com/webmasters/bin/answer.py?hl=en&answer=93708 bye Fabian |
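[Editor's note] As an illustration of the robots.txt approach recommended in this message, here is a minimal Python sketch of how a well-behaved crawler would interpret such rules. The robots.txt content and the paths are hypothetical and not taken from anyone's site; `urllib.robotparser` is part of the Python standard library. It only models compliant crawlers and says nothing about bots that ignore robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the user agents are the two named in this thread,
# the paths and rules are purely illustrative.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: MJ12bot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent, path):
    """Return True if a compliant crawler named `agent` may fetch `path`."""
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for agent, path in [
        ("Googlebot", "/private/page.html"),
        ("Googlebot", "/public/index.html"),
        ("MJ12bot", "/public/index.html"),
    ]:
        print(agent, path, allowed(agent, path))
```

A crawler that honours these rules never requests the disallowed paths, so nothing needs to be banned at the firewall level for it.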
From: Fabien M. <fab...@by...> - 2013-03-11 10:43:51
|
Hi Fabian, I am trying to use it because fail2ban offers this possibility and my client wants to know whether we can stop the bot with it. So I am trying to do it, but I can't find out why it doesn't work... Do you think the default regex may need some modifications? failregex = ^<HOST> -.*"(GET|POST).*HTTP.*"(?:%(badbots)s|%(badbotscustom)s)"$ But is there a guide somewhere that describes the meaning of all the characters? I already use robots.txt to disallow access to specific subfolders. My main motivation is to learn how to use fail2ban correctly with these options! Thanks for your answer! Kind regards, Fabien -- MORCAMP Fabien Ingénieur systèmes Mail: fab...@by... 2, rue des nonettes 77000 Melun |
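[Editor's note] One likely reason the pattern from the first message finds nothing is the unescaped `+` before `http` in the `badbots` value: in a regular expression, ` +` means "one or more spaces", not a literal plus sign. The sketch below checks this against the log line quoted in the thread. It is not fail2ban itself: `HOST` is a simplified IPv4-only stand-in for fail2ban's `<HOST>` tag, and `build_failregex` is a helper invented for this illustration. The `fail2ban-regex` output quoted in the first message also lists only other user agents, so it may additionally be that the edited value was not the one actually tested.

```python
import re

# Access-log line quoted in the thread.
LOG_LINE = (
    '66.249.73.131 - - [08/Mar/2013:15:12:43 +0100] '
    '"GET /some/page/of/website HTTP/1.1" 200 13256 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
)

# Assumption: simplified IPv4-only replacement for fail2ban's <HOST> tag.
HOST = r'(?P<host>\d{1,3}(?:\.\d{1,3}){3})'

# User-agent pattern as posted: the "+" before "http" is NOT escaped.
UA_AS_POSTED = (
    r'Mozilla/5\.0 \(compatible; Googlebot/2\.1; '
    r'+http\://www\.google\.com/bot\.html\)'
)
# Same pattern with the literal plus sign escaped.
UA_ESCAPED = (
    r'Mozilla/5\.0 \(compatible; Googlebot/2\.1; '
    r'\+http\://www\.google\.com/bot\.html\)'
)


def build_failregex(user_agent_pattern):
    """Hypothetical helper: same shape as the failregex in the thread."""
    return re.compile(
        r'^' + HOST + r' -.*"(GET|POST).*HTTP.*"(?:' + user_agent_pattern + r')"$'
    )


if __name__ == "__main__":
    print("as posted:", bool(build_failregex(UA_AS_POSTED).search(LOG_LINE)))
    print("escaped:  ", bool(build_failregex(UA_ESCAPED).search(LOG_LINE)))
```

The sketch only explains why the match fails; whether Googlebot should be banned at all is a separate question, discussed in the replies.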
From: Fabian W. <fa...@we...> - 2013-03-11 12:30:53
|
Hello Fabien On 11.03.2013 11:43, Fabien MORCAMP wrote: > I am trying to use it because fail2ban offers this possibility and my > client wants to know whether we can stop the bot with it. Sure, it can be done, but as I already said, use robots.txt to tell Googlebot what it can or cannot get from the website. In this case I will not create / suggest a regex for you. > So I am trying to do it, but I can't find out why it doesn't work... > Do you think the default regex may need some > modifications? It will not help, as this regex is only for badbots, as the filter name indicates. You need to create your own filter with its own regex. There are enough examples in the mailing list archive. Use fail2ban-regex for testing your own filter / regex. > But is there a guide somewhere that describes the meaning > of all the characters? Check out the "4.3 Filters" part in the manual at [1]. [1] http://www.fail2ban.org/wiki/index.php/MANUAL_0_8#Filters > I already use robots.txt to disallow access to specific subfolders. > My main motivation is to learn how to use fail2ban > correctly with these options! Do not use fail2ban to block "good" crawler / spider bots. Use fail2ban only to block violent abuse, e.g. username/password brute force attacks or such. I created a few filters [2], but they only ban on a wrong username/password. Only one of them also bans when spambots try to brute-force recipient e-mail addresses at my mail server or try other abuses (like trying to relay mail through my server), which would just generate a lot of 'reject' log entries. [2] http://www.wenks.ch/fabian/fail2ban/ > Thanks for your answer! You're welcome. PS: No need to use "reply all"; replying only to the list is perfect, as I filter e-mails based on the "List-Id" header line. bye Fabian |
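[Editor's note] On the question of what the characters in the default failregex mean: the `%(badbots)s` and `%(badbotscustom)s` parts are not regular-expression syntax but placeholders that are replaced by the values of the `badbots` and `badbotscustom` options before the regex is used. The sketch below shows that substitution with Python's standard `configparser`, on the assumption that fail2ban's filter files follow the same `%(name)s` interpolation rules. The option values are shortened, illustrative examples and not the full list shipped with fail2ban.

```python
from configparser import ConfigParser

# Shortened, illustrative filter definition (not the shipped apache-badbots.conf).
FILTER_TEXT = r'''
[Definition]
badbotscustom = EmailCollector|WebEMailExtrac
badbots = sogou music spider
failregex = ^<HOST> -.*"(GET|POST).*HTTP.*"(?:%(badbots)s|%(badbotscustom)s)"$
'''

parser = ConfigParser()
parser.read_string(FILTER_TEXT)

# raw=True returns the value as written, placeholders included.
as_written = parser.get("Definition", "failregex", raw=True)
# The default lookup substitutes %(name)s with the option of that name.
expanded = parser.get("Definition", "failregex")

if __name__ == "__main__":
    print("as written:", as_written)
    print("expanded:  ", expanded)
```

After substitution, what remains is an ordinary regular expression plus the `<HOST>` tag, which is the part the fail2ban manual's Filters section describes.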
From: Fabien M. <fab...@by...> - 2013-03-11 13:44:05
|
Fabian, Thanks for your answer. I will do it with robots.txt. Thanks for the link. I will try modifying the regex to learn how to use it! From now on I will reply only to the list! Kind regards, Fabien -- MORCAMP Fabien Ingénieur systèmes Mail: fab...@by... 2, rue des nonettes 77000 Melun |