[pdftohtml] output is (mostly) nonsense

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

This is a resend.  Since writing it, I found your archives, and I now
see why you don't allow non-member posting.  I assume you'll get through
all that spam to find the real posts some time in the next couple of
years... in the meantime, I'll post this as a member.

- --
Hi,

Please excuse me if there is an archive for this list; I couldn't find
one or links to one on http://pdftohtml.sourceforge.net/.

I'm using pdftohtml for the first time. I have looked through the man
page and tried many different combinations of command-line options, but
the output looks nothing like the PDF document.

The HTML itself is fine, index and links and all, but the content of the
pages looks like the following (I have a screenshot I could send, if that
would help):

!
" #
$ $%
! !
!!&$"
$! &$"
$! &
$
" !
$ $ '
$$

Every once in a while (almost once per page, but not quite), there is a
line, and sometimes a paragraph, of text from the PDF.  The number of
pages is correct.

I tried with -enc UTF-8, but there doesn't appear to be a switch for the
input encoding, had I felt adventurous enough to play with that.

Anyway, I'm assuming there is something straightforward that I'm
missing, but I'm not sure what, and I haven't found this discussed.

btw, I'm running Ubuntu 6.06.

- --
Kent Rasmussen
SIL Eastern Congo Group Linguist
020 608593/4/5 x130
0733-710235(office)
0722-620510(office)
0735-539687(Personal)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)

iD8DBQFFI1p5c7tUjlKyxNMRAui6AKCQAomj+1Z0KUSwD+GmytDBsHQGpwCgmYQ1
7R8iDG2q8Hi4DJ8OS48lF7s=
=1Lo6
-----END PGP SIGNATURE-----