-
Is there a roadmap or a specified release date for the next version?
Since jPod can also manipulate PDFs, would it also be possible to replace the text with two "real" characters?
Best regards, Andy.
2009-07-13 06:17:18 UTC in jPod intarsys PDF library
-
Hi,
Sorry, Acrobat Standard misled me on this. I compared the extracted text and found that both the text from the previous page and the text from the current page are in the PDF for this single page.
So this isn't a jPod problem at all. I guess the problem is caused by iText, which is used to split the document into single pages. Depending on the way a PDF is built, it might...
2009-07-13 06:12:25 UTC in jPod intarsys PDF library
-
There you go:
https://sourceforge.net/tracker/?func=detail&aid=2819407&group_id=203731&atid=986773
Thanks again for the fast response,
Andy.
2009-07-10 06:14:25 UTC in jPod intarsys PDF library
-
The given PDF contains text that Adobe Acrobat calls "hidden text". When using CSTextExtract, this hidden text is returned instead of the visible text. I'd like to get both the hidden and the visible text.
2009-07-10 06:13:13 UTC in jPod intarsys PDF library
-
Using the CSTextExtract example, the ligature in the given PDF is recognized as "f" rather than "fi".
2009-07-10 06:11:25 UTC in jPod intarsys PDF library
-
Upload where? Forum, Bugtracker, CVS?
thanks in advance,
Andy.
2009-07-09 16:31:18 UTC in jPod intarsys PDF library
-
Hi everyone,
As I said before, we're using jPod to extract text from PDF files. We just encountered a PDF which contains "hidden text". Since this text is wrong, we'd also like to extract the visible text. By default, jPod only reads the hidden text (this is what we observe). Is it possible to also extract the visible text?
Thanks in advance.
(We could provide a sample PDF.)
2009-07-02 09:56:59 UTC in jPod intarsys PDF library
-
Hi everyone,
I'm using jPod to extract text from given PDFs. In general, this works very well, but once a text contains a ligature (a single character used for ff, fl or fi), the extracted text only contains the second character.
I can provide an example PDF which demonstrates the problem.
Any ideas?
best regards
Andreas Haufler.
2009-07-02 08:28:18 UTC in jPod intarsys PDF library
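
For readers with a similar ligature problem, here is a minimal post-processing sketch in plain Java. It does not use any jPod API, and the class and method names (LigatureExpander, expandLigatures) are made up for illustration. It assumes the extractor hands back the Unicode ligature code points (U+FB00 to U+FB06, such as the single "fi" character) and expands them into ordinary letters using the JDK's java.text.Normalizer. It cannot recover a letter that the extractor has already dropped, which seems to be the case described above.

```java
import java.text.Normalizer;

public class LigatureExpander {

    /**
     * Expands Latin ligature code points (U+FB00..U+FB06) into plain letters.
     * All other characters are copied unchanged, so the rest of the text is
     * not affected by compatibility normalization.
     */
    public static String expandLigatures(String text) {
        StringBuilder out = new StringBuilder(text.length() + 8);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c >= '\uFB00' && c <= '\uFB06') {
                // NFKC maps a ligature to its component letters.
                out.append(Normalizer.normalize(String.valueOf(c), Normalizer.Form.NFKC));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical input: a word whose first glyph is the "fi" ligature.
        System.out.println(expandLigatures("\uFB01le"));
    }
}
```

Only the ligature range is normalized here, because applying NFKC to the whole string would also rewrite unrelated characters such as superscripts or full-width forms.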