Investigating ht://dig for possible use as an intranet search engine, I nearly dismissed it out of hand.

The "Features and Requirements" page says:

Searching of HTML and text files
Both HTML documents and plain text files can be searched. Searching of other file types will be supported in future versions.

which translates to "searching files other than HTML and plain text is not supported in the current version".

Fortunately I did not believe this, and eventually found in the FAQ a passing reference to PDF in 1.13 and then, in sections 4.8 and 4.9, the concept of an external parser or converter.

I think your Features page should say something like:

Searching of many file types
Both HTML and plain text are searched natively, and PDF, Word, Excel and PowerPoint documents are searched via external helpers. Support for further file types can be added via a plugin system.

Incidentally, I also think your homepage should include approximately the first paragraph of the Introduction where it can be seen, BEFORE the News. The references to defunct search engines such as Infoseek should also be removed.

See http://www.useit.com/alertbox/20020512.html


