The function is_me() is used to match the robot's
short-name against the User-Agent line.
According to the source (v 1.33):
# See whether my short-name is a substring of the
# "User-Agent: ..." line that we were passed:
It does this by using the Perl built-in function index().
However, it apparently passes the arguments to index()
in reverse order, so it never searches for the
short-name within the User-Agent line.
According to `man perlfunc`, the signature of index() is
the following:
index STR,SUBSTR,POSITION
index STR,SUBSTR
The index function searches for one string within
another, but without the wildcard-like behavior of a
full regular-expression pattern match. It returns the
position of the first occurrence of SUBSTR in STR at or
after POSITION. If POSITION is omitted, starts
searching from the beginning of the string. The return
value is based at 0 (or whatever you've set the $[
variable to--but don't do that). If the substring is
not found, returns one less than the base, ordinarily "-1".
WWW::RobotRules currently has this backwards: it calls
index() with the short-name as the first argument and
the User-Agent line as the second. That matches only
when the User-Agent line is itself a substring of the
short-name, which in practice means the two are
identical. This is not often the case.
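
To make the difference concrete, here is a minimal sketch of the two
argument orders. The variable names and the sample strings are made up
for illustration and are not taken from the module; only index() and
lc() are real Perl built-ins.

```perl
use strict;
use warnings;

# Hypothetical values for illustration only (not from WWW::RobotRules).
my $short_name = 'examplebot';
my $ua_line    = 'examplebot/1.0 (testing)';

# Short-name as STR, User-Agent line as SUBSTR: the order described above.
# The longer string cannot be found inside the shorter one.
my $backwards = index(lc($short_name), lc($ua_line));

# User-Agent line as STR, short-name as SUBSTR: the order perlfunc documents
# for "find the short-name inside the User-Agent line".
my $corrected = index(lc($ua_line), lc($short_name));

print "backwards: $backwards\n";   # prints "backwards: -1"
print "corrected: $corrected\n";   # prints "corrected: 0"
```

With these sample strings the first call reports no match, while the
second finds the short-name at position 0.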
Anyway, I'm attaching a diff to correct this issue in
WWW::RobotRules. It seems this bug has existed for some
time now, and it's an easy one to squash.
-Chris Heller
Unified Diff against RobotRules 1.33, fixes small error in the is_me() subroutine.