Hi, how are you guys?
I'm using pdftohtml and am very happy with it,
but for my project it's essential that I also extract non-textual objects from the PDF file, most importantly lines. PDF lines are not graphics, i.e. they are not JPEGs or anything of that sort, but of course they are not text either.
Is there a way to catch them with pdftohtml?
If not, could I modify it to get on the right track?
Thanks a lot,
Raki