pdftohtml -c produces output built from <div> elements and CSS,
which places the text correctly where it was supposed to be,
but loses all the table formatting... Output built from
<table>, <tr> and <td> would be much better!!
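To make the request concrete, here is a rough sketch of the kind of conversion being asked for. It is not how pdftohtml works internally. It assumes the positioned text has already been pulled out as (top, left, text) tuples, and the function name `fragments_to_table` and the `row_tolerance` parameter are made up for illustration. Fragments whose top coordinates are close together are treated as one table row, and cells are ordered by their left coordinate.

```python
from html import escape


def fragments_to_table(fragments, row_tolerance=2):
    """Group (top, left, text) fragments into rows and return an HTML table.

    Hypothetical helper: the input tuples stand in for absolutely
    positioned text boxes; nothing here is part of pdftohtml itself.
    """
    rows = []  # each entry is [top_of_row, [(left, text), ...]]
    for top, left, text in sorted(fragments):
        # Reuse the current row if this fragment sits on (nearly) the same line.
        if rows and abs(top - rows[-1][0]) <= row_tolerance:
            rows[-1][1].append((left, text))
        else:
            rows.append([top, [(left, text)]])

    lines = ["<table>"]
    for _, cells in rows:
        # Order cells left to right and escape the text for HTML.
        tds = "".join(f"<td>{escape(text)}</td>" for _, text in sorted(cells))
        lines.append(f"  <tr>{tds}</tr>")
    lines.append("</table>")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = [(100, 50, "Name"), (100, 200, "Qty"),
              (120, 50, "A & B"), (121, 200, "3")]
    print(fragments_to_table(sample))
```

A real implementation would also need to detect column boundaries and tell tables apart from ordinary paragraphs, which this sketch does not attempt.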
I use pdf2html to convert documents to read on my Palm (the
current Acrobat Reader is terrible on this platform).
Tables are shown as a list of data because the screen isn't
wide enough to display the formatting, but if the data were
defined as a table, the HTML parser could handle it better.
Logged In: YES
user_id=173287
Complex output mode seems to produce reasonable results.
Are you not satisfied with the results of pdftohtml -c?
Logged In: YES
user_id=943591
This might be useful to have.
Logged In: YES
user_id=1317736
I agree with the original poster.