From: Ninti S. <of...@ni...> - 2004-04-05 07:47:49
After a bit of fiddling, I now have HtDig running the external parsers required for PDF and DOC files. All the parsers work when run from the command line, and max_file_size is OK. HtDig is finding these file types and apparently indexing them at least partly (titles and/or metadata only, it seems). rundig -v -v -v shows no problems: the files appear in the output along with everything else, without complaints.

A search can turn the files up if the search terms are carefully selected. The documents are listed with their titles enclosed in square brackets. The excerpt text, however, is not useful; it is simply a repeated "Read 8192" or something similar. It appears that the content is not being indexed, even though the system doesn't complain about or indicate any specific problems.

Anyone seen/solved this before?

TIA,
Mick