#237 empty disallow: keyword creates match-all pattern

Status: open
Owner: nobody
Labels: other (29)
Priority: 5
Updated: 2004-08-26
Created: 2004-08-26
Creator: Anonymous
Private: No

When htdig reads a robots.txt file of the form:

User-agent: *
Disallow:

it creates a pattern from the empty string that follows the
Disallow: keyword. That empty pattern matches every URL, even
though an empty Disallow: value is meant to exclude nothing.

This can be fixed by wrapping the section that creates the
pattern in a test that the pattern has non-zero length before
the protocol specifier is prepended:

if (strlen(pattern) > 0) {
    // Build the regex only when at least one non-empty path was collected.
    String fullpatt = "^[^:]*://[^/]*(";
    fullpatt << pattern << ')';
    cout << "Full pattern: " << fullpatt;  // debug output
    _disallow.set(fullpatt, config->Boolean("case_sensitive"));
}

(from htdig/Server.cc, around line 344 or so).
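
For anyone who wants to reproduce the effect outside htdig, here is a minimal standalone sketch. It uses std::regex rather than htdig's own String and regex classes, and the names join_disallow_paths and is_disallowed are hypothetical helpers invented for this illustration, not functions from Server.cc. It only assumes the same prefix as in the snippet above.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not htdig code): joins the values of Disallow:
// lines into one alternation, skipping empty values.
// Regex escaping of the paths is omitted to keep the sketch short.
std::string join_disallow_paths(const std::vector<std::string>& paths) {
    std::string pattern;
    for (const std::string& path : paths) {
        if (path.empty()) continue;            // empty Disallow: excludes nothing
        if (!pattern.empty()) pattern += '|';
        pattern += path;
    }
    return pattern;
}

// Hypothetical helper: returns true if the URL is blocked by the pattern.
// With guard_empty == false the prefix is wrapped around the pattern even
// when it is empty, which mimics the behaviour reported in this ticket.
bool is_disallowed(const std::string& url, const std::string& pattern,
                   bool guard_empty) {
    if (guard_empty && pattern.empty()) return false;
    const std::regex full("^[^:]*://[^/]*(" + pattern + ")");
    return std::regex_search(url, full);
}
```

Without the guard, the empty pattern yields ^[^:]*://[^/]*(), which is satisfied by any URL with a scheme and host. With the guard, an empty pattern blocks nothing, while a non-empty path such as /private is still matched as before.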

For more details, please contact schampeo@hesketh.com.

Thanks,
Steve

Discussion