From: Steven N. <st...@ou...> - 2003-03-31 12:11:01
Hi,

I recently endeavoured into an htDig installation adventure and am currently stymied w.r.t. differences in behaviour between two installs on mostly similar servers.

I have everything up & running OK on a RH7.2 machine (Perl 5.6), and also did an install on a Redhat 8 machine, which comes with Perl 5.8. I'm quite explicit about these Perl versions, since

a) I tried to do my homework and believe both machines have been set up similarly, factoring out simple installation errors,
b) command-line invocations of the different tools (antiword, xlhtml, ppthtml and pdftotext) work flawlessly on both machines, while
c) invocation of either doc2html or pdf2html doesn't work correctly on the RH8 machine with the newer version of Perl.

I'm not in a position to swap versions of Perl on the RH8 machine, but am seeking confirmation that this new Perl version could actually be the cause of my installation nightmare.

One example of strange behaviour:

  ./doc2html.pl "/var/www/htdigtest/xxe_installatie.pdf" "application/pdf" url

results in PDF gibberish being injected into the HTML:

  <HTML>
  <HEAD>
  <TITLE>[url]</TITLE>
  </HEAD>
  <BODY>
  <PRE>
  %PDF-1.3
  %????
  4 0 obj
  << /Type /Info
  /Producer (FOP 0.20.4) >>
  endobj
  5 0 obj
  << /Length 1830 /Filter [ /ASCII85Decode /FlateDecode ]
  ....

whereas

  ./pdf2html.pl "/var/www/htdigtest/xxe_installatie.pdf" "application/pdf" url

works somewhat better, but still injects error messages into the HTML output:

  <HTML>
  <HEAD>
  <TITLE>[url]</TITLE>
  </HEAD>
  <BODY>
  Client-installatie XXE <p>
  by <p>
  1. Download <p>
  Warning: <br>
  het te downloaden bestand hangt af van uw besturingssysteem. Voor Windows-gebaseerde systemen, download het ZIP <br>
  bestand, voor Linux: download het TAR.GZ bestand
  Malformed UTF-8 character (unexpected continuation byte 0xb7, with no preceding start byte) in substitution (s///) at ./pdf2html.pl line 117, <CAT> line 15.
  Malformed UTF-8 character (unexpected continuation byte 0xb7, with no preceding start byte) in substitution (s///) at ./pdf2html.pl line 117, <CAT> line 15.

The other utilities, however, simply don't get invoked by doc2html, resulting in UNABLE TO CONVERT error messages.

I'm utterly clueless w.r.t. Perl, so any guidance would be very much welcome.

Cheers,

</Steven>
-- 
Steven Noels http://outerthought.org/
Outerthought - Open Source, Java & XML Competence Support Center
Read my weblog at http://blogs.cocoondev.org/stevenn/
stevenn at outerthought.org stevenn at apache.org
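
One possible check, offered as a sketch rather than a confirmed diagnosis: the "Malformed UTF-8 character" warning quoted above suggests that Perl 5.8 may be treating its input as UTF-8, which can happen when the environment uses a UTF-8 locale. The message does not state the locale of either machine, so that part is an assumption. The helper names `show_locale` and `run_in_c_locale` below are hypothetical and only serve the illustration.

```shell
#!/bin/sh
# Sketch only: assumes the RH8 machine may be running under a UTF-8 locale,
# which the original message does not confirm.

# Print the locale variables that the converter scripts would inherit.
show_locale() {
    echo "LANG=${LANG:-unset} LC_ALL=${LC_ALL:-unset} LC_CTYPE=${LC_CTYPE:-unset}"
}

# Run a command with the byte-oriented "C" locale for that one invocation,
# leaving the calling shell's environment untouched.
run_in_c_locale() {
    env LC_ALL=C LANG=C "$@"
}

show_locale

# Example invocation, using the command line from the message:
# run_in_c_locale ./pdf2html.pl "/var/www/htdigtest/xxe_installatie.pdf" "application/pdf" url
```

If the warnings disappear under the "C" locale, the locale setting is a likelier culprit than the converter scripts themselves. If they persist, the cause lies elsewhere.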