From: Douglas K. <kl...@he...> - 2004-05-26 20:37:08
> In my version of doc2html.pl the magic number '^\320\317\021\340' is the
> test for a Microsoft file (Word, Excel or Powerpoint).
>
> The test for a PDF file should use '%PDF-|\0PDF CARO\001\000\377'.
>
> > I've downloaded doc2html.pl and have been experimenting with it to process pdf
> > files. I've found that pdf2html.pl works but doc2html.pl which should be
> > calling pdf2html.pl doesn't work and isn't calling pdf2html.pl (I've edited
> > both files to fill in local pathnames). I think that it's comparing the magic
> > number '^\320\317\021\340' given in the store_methods sub-routine for pdf files
> > with the beginning of the file which it reads in the read_magic sub-routine and
> > finding they don't match. Before I pursue this further, is this a known
> > problem? TIA.

I got the wrong number in my message. Sorry. I might have picked up the
wrong part of the file with the mouse. The magic number for a pdf file
given in doc2html.pl is what you've written.

I've run an octal dump on the pdf file and found that it begins with

    % P D F - 1 . 4 \r % 342 343 317 323

If the "|" in the magic number it's looking for is the alternation symbol,
then this should match. Is there some reason it wouldn't? TIA.

Douglas
========
Douglas Kline
kl...@he...
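
The question of whether the quoted pattern should match the dumped bytes can be checked in isolation. The following is only a minimal sketch, written in Python rather than Perl, and it is not the read_magic code from doc2html.pl. It assumes the pattern is applied as an ordinary regular expression to the first bytes of the file; the helper name matches_magic is hypothetical.

```python
import re

# Magic-number patterns as quoted in this thread (octal escapes kept verbatim).
PDF_MAGIC = re.compile(rb'%PDF-|\0PDF CARO\001\000\377')
MS_MAGIC = re.compile(rb'^\320\317\021\340')

# First bytes of the PDF file, taken from the octal dump in the message.
HEADER = b'%PDF-1.4\r%\342\343\317\323'


def matches_magic(pattern, data):
    """Hypothetical helper: True if the pattern is found anywhere in data."""
    return pattern.search(data) is not None


if __name__ == "__main__":
    print("PDF pattern matches:", matches_magic(PDF_MAGIC, HEADER))
    print("MS pattern matches: ", matches_magic(MS_MAGIC, HEADER))
```

Under these assumptions the "|" does act as alternation: the PDF pattern matches on its first alternative, '%PDF-', and the Microsoft pattern does not match. If the same holds in Perl, the alternation itself is unlikely to be the culprit, and the cause would have to be looked for elsewhere in the script, which this sketch cannot show.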