#80 v0.39 win: pdftohtml -xml VS pdftohtml -xml -enc UTF-8

Status: open
Owner: nobody
Labels: None
Priority: 5
Updated: 2006-12-11
Created: 2006-12-11
Creator: Cetin Sert
Private: No

Dear Developers,

I have discovered a minor issue with pdftohtml v0.39. When I run:

INCORRECT
>pdftohtml -xml sample.pdf sample

the resulting sample.xml is not well-formed.

If I specify the output text encoding as in:

CORRECT
>pdftohtml -xml -enc UTF-8 sample.pdf sample

the resulting sample.xml is well-formed!

Best Regards,
Cetin Sert
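
The comparison described above could be automated roughly as follows. This is only a sketch: it assumes `pdftohtml` is on the PATH and writes `<output base>.xml` as in the report, and the helper names `is_well_formed` and `convert_and_check` are hypothetical, not part of pdftohtml. It checks well-formedness with Python's standard XML parser and does not try to explain why the default output fails.

```python
import subprocess
import xml.etree.ElementTree as ET


def is_well_formed(xml_path):
    """Return True if the file parses as XML, False on a parse error."""
    try:
        ET.parse(xml_path)
        return True
    except ET.ParseError:
        return False


def convert_and_check(pdf_path, out_base, encoding=None):
    """Run pdftohtml -xml (optionally with -enc) and check the output.

    Assumes the converter writes its result to out_base + ".xml",
    as described in the report.
    """
    cmd = ["pdftohtml", "-xml"]
    if encoding is not None:
        cmd += ["-enc", encoding]
    cmd += [pdf_path, out_base]
    subprocess.run(cmd, check=True)
    return is_well_formed(out_base + ".xml")
```

Calling `convert_and_check("sample.pdf", "sample")` and then `convert_and_check("sample.pdf", "sample", "UTF-8")` would reproduce the two runs from the report and show whether each output parses.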

Discussion

  • Cetin Sert
    2006-12-11

    sample pdf

     