From: <al...@ma...> - 2012-08-13 21:58:11
Hi,

Thank you for taking the time to respond to this. I've now found Cyril's email and sent him this message directly.

It may come as a surprise, but 404 Not Found errors are very common in a web-scale crawl - anybody can create wrong URLs pointing to any site, and search engine bots have no way of knowing those URLs don't exist until they actually crawl them.

There were 6 more 404 Not Found errors in the log snippet that was shown in the email. Sadly, these emails often don't make clear which domain name they relate to. That makes dealing with them very hard, because the email address of the person on whose behalf the abuse email was sent is typically not shown when the hosting provider forwards it to the server owner (i.e. us).

This means that treating a few 404s as something very bad (such as hacking) is naïve logic that causes a lot of false positives - we've probably spent at least a few thousand dollars dealing with such emails so far, none of which had proper evidence to back up the claim of "hacking".

Our view is that accusing unknown people of "ABUSE/HACKING" when a few errors are found in a log file is NOT reasonable. "Hacking" is a criminal offence in many jurisdictions around the world, and it is normally a good idea to avoid accusing people of a criminal offence without rock-solid evidence that has been reviewed by police/prosecutors. This is a pretty important legal aspect that I think was, perhaps, not fully considered in the creation of this software.

Our crawler respects the right of webmasters to control what it crawls - that's why we support robots.txt - but we can't accept a situation where false complaints are made against us by software that generates so many false positives.

> I think you need to reply to the owners of each block that you have received complaints from and ask them why they think you are improperly accessing their sites.
People who run this software are normally hidden from the emails we get - typically not even the domain name is shown, so we can't even stop crawling those domains, and the hosting providers who receive the abuse email typically will not tell us who made the complaint, for client confidentiality reasons. As you can see, this has left us very frustrated, so we had no choice but to try to contact the people who developed this software.

> A complaint to the package maintainers will not help.

We don't want to complain - we want to solve this problem of false positives that is costing us a lot of money. Dealing with each person who installs this software is not efficient (especially when they are hidden from us).

I'll wait for Cyril's response to our message and see if some solution can be found to the multiple false positives that fail2ban generates.

Cheers,
Alex

-----Original Message-----
From: Yehuda Katz [mailto:ye...@ym...]
Sent: 13 August 2012 22:36
To: al...@ma...
Cc: fai...@li...
Subject: Re: [Fail2ban-users] Trying to contact maintainers/developers of fail2ban

Fail2Ban does not treat 404s (or anything else for that matter) as hacking. The purpose of Fail2Ban is to detect MULTIPLE errors, whether they are login errors (wrong username or password) or any other kind of error.

The maintainers of Fail2Ban have no control over how people configure their software. If a user has set Fail2Ban to ban your bot after a single 404 on robots.txt, then there is nothing the Fail2Ban maintainers can do about it. It certainly does not do that by default.

Either way, the notification you forwarded looks strange to me. When I send abuse reports to IP contacts, I always include at least 6 log lines that obviously demonstrate a hacking attempt. One single line could represent an honest mistake (as in your case: you have no way of knowing that the file does not exist).
I think you need to reply to the owners of each block that you have received complaints from and ask them why they think you are improperly accessing their sites. A complaint to the package maintainers will not help.

- Y (not a maintainer, just a volunteer)

On Mon, Aug 13, 2012 at 4:54 PM, <al...@ma...> wrote:

Hi,

I've tried to find contact information for the maintainers/developers of fail2ban but could not find anything better than this list. If this is not the right place to report a problem related to this software, then please point me in the right direction.

I represent the Majestic 12 Distributed Search Engine project, which has been crawling the web since late 2004. You can find more information about us on our site here: http://www.majestic12.co.uk/

Recently we've come across a persistent problem: our hosting providers are getting automated abuse emails, sent to IP block owners, with subjects such as the following:

------------------------------------------------------
ABUSE NOTIFICATION: HACK ATTEMPT

We have detected abuse from the IP address XX.XX.XX.XX, which according to a whois lookup is on your network. We would appreciate if you would investigate and take action as appropriate.

Log lines are given below, but please ask if you require any further information.

(If you are not the correct person to contact about this please accept our apologies - your e-mail address was extracted from the whois record by an automated process. This mail was generated by Fail2Ban.)

Note: Local timezone is -0400 (EDT)

/var/www/clients/EXAMPLE.com/logs/error.log:[Wed Aug 08 07:55:05 2012] [error] [client XX.XX.XX.XX] File does not exist: /var/www/clients/EXAMPLE.com/htdocs/robots.txt
------------------------------------------------------

This is the typical email, and as you can see it treats a simple HTTP 404 Not Found error on the robots.txt file (which all good bots attempt to retrieve in order to comply with the robots.txt standard - http://en.wikipedia.org/wiki/Robots_exclusion_standard ) as a hack attempt.
These false allegations cause a substantial waste of time and money, both for us and for the hosting providers who have to deal with these automated emails. They also have the potential to expose people who run your software to legal action, because making accusations of "hacking" and other illegal activity is not something that should be done lightly, without thorough legal review.

From our end, our crawler always identifies itself in its User-Agent and supports robots.txt (including some non-standard directives).

It would be very helpful if whoever maintains this software considered all the implications of treating simple 404 errors as a sign of "hacking" and, more importantly, reviewed the policy of sending automatic abuse emails that make such serious accusations when no evidence is actually given (we've got plenty of examples of this happening).

Best regards,

Alex Chudnovsky
Managing Director

Email : al...@ma...
Web : http://www.majestic12.co.uk

Majestic-12 Ltd
Faraday Wharf, Holt Street
Birmingham Science Park, Aston
Birmingham, B7 4BB
United Kingdom
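
For readers following the thread, the point made in the reply above (that a ban should follow MULTIPLE errors from one address, not a single 404) can be illustrated with a small sketch. This is not Fail2Ban's actual code. The function name `find_offenders` and the input format are hypothetical; the parameter names only loosely echo Fail2Ban's `maxretry` and `findtime` settings, and the default values are arbitrary.

```python
from collections import defaultdict, deque


def find_offenders(events, max_retry=5, find_time=600):
    """Return the IPs that logged at least max_retry errors within find_time seconds.

    events: iterable of (timestamp_in_seconds, ip) pairs, sorted by timestamp.
    Hypothetical helper for illustration only; not part of Fail2Ban.
    """
    recent = defaultdict(deque)  # ip -> timestamps of errors still inside the window
    offenders = set()
    for ts, ip in events:
        window = recent[ip]
        window.append(ts)
        # Forget errors that are older than the sliding window.
        while window and ts - window[0] > find_time:
            window.popleft()
        if len(window) >= max_retry:
            offenders.add(ip)
    return offenders


if __name__ == "__main__":
    # One 404 from a crawler (e.g. a missing robots.txt) next to a burst of errors.
    crawler = [(0, "192.0.2.1")]
    burst = [(i * 10, "198.51.100.7") for i in range(5)]
    print(find_offenders(sorted(crawler + burst)))
```

With a threshold like this, a crawler that hits one missing robots.txt is never flagged, while an address producing a burst of errors inside the window is. A rule configured with a threshold of one would flag both, which is the situation described in the thread.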