Hi, how are you guys?
I'm using pdftohtml and am very happy with it,
but for my project it's essential that I also extract non-textual objects from the PDF file, most importantly lines. PDF lines are not images, i.e. they are not JPEGs or anything of that sort, but they are not text either, of course.
Is there a way to catch them with pdftohtml?
If not, can I make any modifications that would get me on the right track?
Thanks a lot,