From: Roan H. <Ro...@Ho...> - 2007-05-11 20:53:37
Here's a page with a PHP-based solution:

How to build a Bot Trap and keep bad bots away from a web site
Block spam bots and other bad bots from accessing and scanning your web site
http://www.kloth.net/internet/bottrap.php

Roan

Jon Phillips wrote:
> On Wed, 2007-05-09 at 14:00 -0700, Victor Stone wrote:
>
>> On 5/8/07, Jon Phillips <jo...@re...> wrote:
>>
>>> Victor, is the code to do the nasty bot trapping in main ccHost, or is
>>> it ccMixter and/or is there a way to set it up?
>>>
>>> Open Clip Art Library has been getting reamed and had to take emergency
>>> actions because some bots coming from china are ignoring robots.txt.
>>>
>>> Am I remembering correctly that you added a nice fix to ccmixter.org to
>>> solve this? I would like to add to openclipart.org
>>>
>> Actually for the china bots I took OCA's advice and just used IPTABLES
>> to block them up front.
>>
>> The code you're talking about is 5 lines of code that are 'hidden' to
>> anybody using the site legit. It adds the incoming IP to the deny
>> section of .htaccess. I don't promote wide use of this technique
>> because you have to be SURE the links are hidden from legit users
>> otherwise their IP will give them 403's for the entire site. There's
>> no way to tell how many legit ccMixter users are in this state since
>> they are, well, cut off from the site.
>>
>> VS
>>
>
> From looking at the logs, it seems that the site is getting reamed from
> the multiple tags.
>
> Damn bots that don't respect robots.txt. Looks like gigablast.com's bots
> are not respecting either!
>
> Hmmm...might have to limit the tag search to 3-4 tags, and/or figure out
> a way to make it less juicy for search bots...any suggestions on this
> front?
>
> thx as always vs :)
>
> Jon
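
For anyone who wants to see the shape of the .htaccess trick Victor describes, here is a minimal sketch in Python. It is not the ccMixter code and not the script from the kloth.net page. The function name `ban_ip`, the file path, and the Apache 2.2-style `Deny from` line are all assumptions made for illustration. The idea is that the handler behind the hidden trap link calls this with the visitor's address.

```python
import ipaddress
from pathlib import Path


def ban_ip(htaccess_path, ip):
    """Append 'Deny from <ip>' to an .htaccess file (hypothetical helper).

    Returns True if a rule was added, False if the address was already listed.
    Raises ValueError if `ip` is not a valid IPv4/IPv6 address, so arbitrary
    text can never be written into the server config.
    """
    addr = str(ipaddress.ip_address(ip))
    path = Path(htaccess_path)
    text = path.read_text() if path.exists() else ""
    rule = f"Deny from {addr}"  # assumed Apache 2.2-style directive

    # Skip addresses that are already banned to keep the file from growing.
    if rule in (line.strip() for line in text.splitlines()):
        return False

    # Make sure the new rule starts on its own line.
    prefix = "" if text == "" or text.endswith("\n") else "\n"
    with path.open("a") as fh:
        fh.write(f"{prefix}{rule}\n")
    return True
```

Victor's caveat applies in full: the link that triggers this must be invisible to legitimate visitors (and disallowed in robots.txt), otherwise real users end up with 403s for the whole site. The sketch also does no file locking, so concurrent hits on a busy site could interleave writes.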