[htdig] Acroread Message fixed


Hi All,

Thank you all for your help. We took the easy way out for now by adding
.pdf and .PDF to the bad_extensions line in htdig.conf. We plan to set
htdig up to index the PDF files in the near future.

Steve Lewis
Manager of IT, QA and Manufacturing
Lumeta Corporation
sl...@lu...
732 357-3523 Voice
732 618-6006 Cell
866 213-5250 Pager
AIM creativerecords

-----Original Message-----
From: Jim [mailto:li...@yg...]
Sent: Tuesday, September 21, 2004 3:07 AM
To: Steve Lewis
Cc: htd...@li...
Subject: Re: [htdig] Acroread message

On Fri, 17 Sep 2004, Steve Lewis wrote:

> I'm new to HtDig and have one issue that is bothersome but not a big
> problem. Every time someone uses the search engine on our site I get a
> message from our cron job as follows:
>
> PDF::parse: cannot find pdf parser /usr/local/bin/acroread

It sounds like htdig is encountering some PDFs and trying to use the
default handling mechanism, which is failing due to acroread not being
found. If you really want to index the PDFs, you should probably start
by reading http://www.htdig.org/FAQ.html#q4.9. If you don't care about
the PDFs and just want to get rid of the message, it would probably be
easiest to just add .pdf to the bad_extensions attribute.

   http://www.htdig.org/attrs.html#bad_extensions
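
For illustration, the bad_extensions change might look roughly like the
fragment below. This is only a sketch: the entries before .pdf are
assumed to resemble the documented default and may differ in your
version, so keep whatever your own htdig.conf already lists and append
the PDF entries to it.

```
# htdig.conf (sketch): the leading extensions are an assumption, not
# taken from this thread; keep your existing list and append to it.
bad_extensions: .wav .gz .z .sit .au .zip .tar .hqx .exe .com .gif \
                .jpg .jpeg .aif .aiff .class .map .ram .tgz .bin \
                .rpm .mpg .mov .avi .css .pdf .PDF
```

URLs ending in a listed extension are skipped during the dig, so the
PDF parser should no longer be invoked for them. The change takes effect
the next time rundig (or htdig) runs.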

Btw, I suspect what is happening is that you are getting the message you
refer to each time cron tries to execute rundig, not each time someone
uses the search engine on your site. The site search calls htsearch,
which doesn't try to parse PDFs or do anything with cron.

Jim