When htdig GETs a document that requires an external parser, it writes the document to a temporary file, passes the name of that temporary file (plus the URL and MIME type) to the external parser, and waits for it to return. The parser therefore needs to be on the machine that does the digging, and I don't see that it would help to have it on a different machine.
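To illustrate why the parser has to sit on the digging machine, here is a minimal Python sketch of that hand-off. It is not htdig's own code: the function name run_external_parser, the stand-in parser, and the argument order (file name, MIME type, URL) are assumptions made for illustration only.

```python
import os
import subprocess
import sys
import tempfile


def run_external_parser(parser_cmd, body, mime_type, url):
    """Write body to a temp file, hand it to the parser, wait, return its output."""
    fd, path = tempfile.mkstemp(suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(body)
        # subprocess.run blocks until the parser exits, mirroring the wait
        # described above. The argument order here is an assumption.
        result = subprocess.run(
            parser_cmd + [path, mime_type, url],
            capture_output=True,
            check=True,
        )
        return result.stdout
    finally:
        # The temporary file only ever exists on the local disk.
        os.remove(path)


# Hypothetical stand-in for a real parser: it reports the size of the file
# it was given and echoes back the MIME type and URL.
STAND_IN_PARSER = [
    sys.executable,
    "-c",
    "import os, sys; print(os.path.getsize(sys.argv[1]), sys.argv[2], sys.argv[3])",
]

if __name__ == "__main__":
    out = run_external_parser(
        STAND_IN_PARSER,
        b"%PDF-1.3 dummy",
        "application/pdf",
        "http://example.com/a.pdf",
    )
    print(out.decode().strip())
```

Since the parser is given a local file name, it has to run where that file lives. The web server that receives the copied db files never calls the parser.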
As for which parser is best, use the doc2html.pl and pdf2html.pl scripts together with pdftotext and pdfinfo from version 1.0 of xpdf.
David Adams
Computing Services
Southampton University
----- Original Message -----
From: Ryan Spaulding
To: htdig-general@lists.sourceforge.net
Sent: Monday, February 25, 2002 5:08 PM
Subject: [htdig] PDF issues


Hello all,


I am new to this board. I am having issues indexing PDFs. I have one machine doing the digging and then scp'ing the db files up to my external web server. Does the external parser (pdftotext) have to be on the same machine too? Also, which external parser is the best?


Thanks in advance


