I am getting a lot of activity from the Cuil robot Twiceler that is coming up with:
07.12.2009 16:31:17 216.129.119.12 Anonymous - UA>Mozilla/5.0 (Twiceler-0.9 http://www.cuil.com/twiceler/robot.html)<
07.12.2009 16:31:17 216.129.119.12 Anonymous - MSG>Blocked crawler detected; script terminated.
Twiceler is in the white list, so I do not understand why it is getting blocked.
I have been monitoring Twiceler since it started pinging my sites many months ago. Initially I blocked it, but after giving it a trial I changed my mind. I have a number of sites with robots.txt instructions, and it has never disobeyed any of them. Remember, just because something is on the internet does not make it true; there are conspiracy theorists behind every tree on forums. Twiceler may have had problems in its early days and is now being punished for them forever.
Tonykgv
<<Twiceler is in the white list, so I do not understand why it is getting blocked.>>
The hard-coded list of "bad bots" in session_spider.php includes "oBot", which matches anything with the word "robot" in the name.
Maybe the match shouldn't be case-insensitive?
Maybe the match should consider word boundaries?
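To make the two suggestions concrete, here is a minimal sketch in Python (not the actual PHP in session_spider.php, which I have not reproduced). It assumes the current check is a case-insensitive substring test, as the behaviour above implies; the function names and the one-entry list are made up for illustration, and only "oBot" and the user agent string come from this thread.

```python
import re

# Hypothetical stand-in for the hard-coded "bad bots" list.
# Only the "oBot" entry is taken from this thread.
BAD_BOTS = ["oBot"]

# User agent copied from the log lines in the original report.
UA = "Mozilla/5.0 (Twiceler-0.9 http://www.cuil.com/twiceler/robot.html)"


def substring_match(user_agent, patterns, ignore_case=True):
    """Plain substring test, optionally case-insensitive (assumed current behaviour)."""
    if ignore_case:
        lowered = user_agent.lower()
        return any(p.lower() in lowered for p in patterns)
    return any(p in user_agent for p in patterns)


def word_match(user_agent, patterns):
    """Case-insensitive test that only matches a pattern standing as a whole word."""
    return any(
        re.search(r"\b" + re.escape(p) + r"\b", user_agent, re.IGNORECASE)
        for p in patterns
    )


# Case-insensitive substring: "obot" is found inside "robot.html", so Twiceler is blocked.
blocked_now = substring_match(UA, BAD_BOTS)

# Case-sensitive substring: "oBot" does not occur in the string, so no block.
blocked_case_sensitive = substring_match(UA, BAD_BOTS, ignore_case=False)

# Word boundaries: "obot" is preceded by "r", so it is not a whole word and no block.
blocked_word = word_match(UA, BAD_BOTS)

# A user agent that really is named oBot would still be caught by the word match.
blocked_real_obot = word_match("oBot/1.0", BAD_BOTS)
```

Either change would stop the false positive on the robot.html URL. The word-boundary version seems the safer of the two, since it still catches a bot that sends its name in a different case.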
Check the Google listings for "twiceler-0.9".
You'll find lots of messages about Twiceler not respecting robots.txt and gobbling up a lot of bandwidth.
There's also mention of Twiceler using several different IP address ranges, some of which don't resolve properly when you do a reverse DNS lookup.
The fact that it doesn't honour robots.txt is enough to get it on the "bad bots" list.