From: David A. <D.J...@so...> - 2003-03-27 15:18:35
Your external_parsers statement is in error, and you may also be truncating the PDF file.
See FAQ 4.9:
http://www.search.soton.ac.uk/htdig/FAQ.html#q4.9
and FAQ 5.2:
http://www.search.soton.ac.uk/htdig/FAQ.html#q5.2

David Adams
Corporate Information Services
Information Systems Services
University of Southampton

----- Original Message -----
From: "Anne Durand" <ann...@ga...>
To: <htd...@li...>
Sent: Thursday, March 27, 2003 3:01 PM
Subject: [htdig] indexing pdf files

> Hello
> When I run on command line
> doc2html.pl /full/path/to/sample/Maison_Guiette.pdf "application/pdf" url
> I don't get any error and the parsing looks ok.
>
> The htdig.conf file contains
> external_parsers: application/pdf /usr/local/bin/doc2html.pl
>
> When I run htdig, I get the following errors :
> Error (0): PDF file is damaged - attempting to reconstruct xref table...
> Error: Top-level pages object is wrong type (null)
> Error: Couldn't read page catalog
> External parser error in line:<HTML>
> URL: http://www.archi.fr/UIA/htmEdifices/DOCOMOMO/Belgium/Maison_Guiette.pdf
> External parser error in line:<HEAD>
> ....
>
> Thanks for any suggestion
> @nne
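
For anyone following the two FAQ pointers above: doc2html.pl is a converter that emits HTML, not a parser that emits ht://Dig's own external-parser record format, which would explain the "External parser error in line:<HTML>" messages. A minimal sketch of the two settings involved follows. It assumes an ht://Dig release recent enough to accept the "->" converter form in external_parsers, and the max_doc_size value is purely illustrative; it is not taken from the thread and should be at least as large as the biggest PDF to be indexed.

```
# htdig.conf (sketch; adjust paths and values to your installation)

# Declare doc2html.pl as a converter from PDF to HTML rather than as a
# plain external parser, so htdig parses its output as HTML.
external_parsers: application/pdf->text/html /usr/local/bin/doc2html.pl

# Raise the per-document size limit so large PDFs are fetched whole.
# 5000000 is an assumed example value, in bytes.
max_doc_size: 5000000
```

A PDF keeps its cross-reference (xref) table at the end of the file, so a truncated download loses it. That would be consistent with the "PDF file is damaged - attempting to reconstruct xref table" error in the quoted message, and with the same file converting cleanly when doc2html.pl is run by hand on the complete local copy.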