I spent a day switching away from pdfbox to jPod and adapted this library for my use case. I'm dumping this code over the fence without much support and I'm hoping that it is useful.
Provided are:
Multipage TIFF reader where CCITT streams are embedded into PDImages without decoding the TIFF other than at the tag level. This allows using jPod to implement something like tiff2pdf.
JPEG reader where JPEG is embedded into PDImage without decoding it (other than looking up its width/height).
Lossless image support from any BufferedImage, e.g. for PNG. I cut all corners and just made it an RGB image with a full 8-bit alpha channel rather than trying to optimize it. I was pressed for time and I mostly deal with 32-bit ARGB images.
Very crude but nevertheless functioning Unicode-encoded TTF loader that permits writing international text with jPod. I can't guarantee it's bug-free, and it obviously leaves much to be desired. The most important problem is that it has no ToUnicode cmap.
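To illustrate what "looking up its width/height" without decoding can mean for the JPEG reader listed above, here is a minimal sketch that walks the JPEG marker segments until it finds a start-of-frame (SOFn) segment and reads the dimensions from it. This is not the code from the attached file; the class name JpegSize and the method readSize are made up for this example, and it assumes a well-formed baseline or progressive JPEG stream.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of jPod or of the attached code.
final class JpegSize {

    /** Returns {width, height} taken from the first SOFn segment of a JPEG stream. */
    static int[] readSize(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        if (data.readUnsignedShort() != 0xFFD8) {
            throw new IOException("Not a JPEG: missing SOI marker");
        }
        while (true) {
            if (data.readUnsignedByte() != 0xFF) {
                throw new IOException("Expected a marker");
            }
            int marker = data.readUnsignedByte();
            while (marker == 0xFF) {
                marker = data.readUnsignedByte(); // skip fill bytes
            }
            if (marker == 0xD9 || marker == 0xDA) {
                throw new IOException("No SOF segment before image data");
            }
            int length = data.readUnsignedShort(); // counts its own two bytes
            if (length < 2) {
                throw new IOException("Bad segment length");
            }
            if (isStartOfFrame(marker)) {
                data.readUnsignedByte();               // sample precision
                int height = data.readUnsignedShort(); // height comes first
                int width = data.readUnsignedShort();
                return new int[] { width, height };
            }
            data.readFully(new byte[length - 2]);      // skip the segment payload
        }
    }

    /** SOF0..SOF15, excluding DHT (C4), JPG (C8) and DAC (CC). */
    static boolean isStartOfFrame(int marker) {
        return marker >= 0xC0 && marker <= 0xCF
                && marker != 0xC4 && marker != 0xC8 && marker != 0xCC;
    }

    public static void main(String[] args) throws IOException {
        // Hand-built header: SOI, a tiny APP0 segment, then an SOF0 segment.
        byte[] header = {
            (byte) 0xFF, (byte) 0xD8,
            (byte) 0xFF, (byte) 0xE0, 0x00, 0x04, 0x00, 0x00,
            (byte) 0xFF, (byte) 0xC0, 0x00, 0x0B, 0x08, 0x00, 0x10, 0x00, 0x20,
            0x01, 0x01, 0x11, 0x00
        };
        int[] size = readSize(new ByteArrayInputStream(header));
        System.out.println(size[0] + " x " + size[1]);
    }
}
```

The compressed data itself is never touched, so the original bytes can be embedded as they are.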
While doing this work, I was impressed by the overall cleanliness and orderliness of the code.
However, a few details of the Unicode text handling should be changed. Firstly, it's not correct to prefer char[] as the Unicode string representation, because code points above U+FFFF (65535) are stored as surrogate pairs and so take two chars each. The appropriate approach is to use String's method codePointAt(n) and then increment n by Character.charCount(codePoint). I think the optimal representation is a String instance. There is a sample of this approach in the anonymous inline Encoding class.
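The iteration described above looks roughly like this. It is a standalone sketch using only the JDK; the class name CodePointWalk is made up for the example and it is not the Encoding class from the attached file.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of jPod.
final class CodePointWalk {

    /** Returns the code points of text, one list entry per code point. */
    static List<Integer> codePoints(String text) {
        List<Integer> result = new ArrayList<>();
        int n = 0;
        while (n < text.length()) {
            int codePoint = text.codePointAt(n);
            result.add(codePoint);
            // 1 for BMP characters, 2 when the code point is a surrogate pair
            n += Character.charCount(codePoint);
        }
        return result;
    }

    public static void main(String[] args) {
        // 'A' followed by U+1F600, which occupies two chars in the String
        String text = "A\uD83D\uDE00";
        for (int codePoint : codePoints(text)) {
            System.out.printf("U+%04X%n", codePoint);
        }
    }
}
```

A loop over char[] would see three values here (the 'A' and two surrogate halves), while this loop sees the two actual code points.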
Secondly, I had to use reflection to force the cachedEncoding into PDFontType0. Calling setEncoding() resulted in a CMapEncoding being used rather than my own minimal encoder. I suspect that this issue has been fixed in library release 5.6.0, but for some reason that version is not available in the Maven repository.
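For readers unfamiliar with the reflection trick, the general shape is sketched below. FontLike is a made-up stand-in class, not jPod's PDFontType0, and forceField is a hypothetical helper; only the field name cachedEncoding is taken from the description above, and where that field is actually declared in jPod is an assumption the helper avoids by searching the superclasses too.

```java
import java.lang.reflect.Field;

// Hypothetical demo, not jPod code.
final class ReflectionSketch {

    /** Stand-in for a library class that caches a value in a private field. */
    static class FontLike {
        private Object cachedEncoding;

        Object getEncoding() {
            if (cachedEncoding == null) {
                cachedEncoding = "library default";
            }
            return cachedEncoding;
        }
    }

    /** Sets a (possibly private) field declared on target's class or a superclass. */
    static void forceField(Object target, String fieldName, Object value)
            throws ReflectiveOperationException {
        Class<?> type = target.getClass();
        while (type != null) {
            try {
                Field field = type.getDeclaredField(fieldName);
                field.setAccessible(true); // bypass the private modifier
                field.set(target, value);
                return;
            } catch (NoSuchFieldException e) {
                type = type.getSuperclass(); // keep looking further up
            }
        }
        throw new NoSuchFieldException(fieldName);
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        FontLike font = new FontLike();
        forceField(font, "cachedEncoding", "my minimal encoder");
        System.out.println(font.getEncoding());
    }
}
```

This kind of workaround depends on a private field name, so it can break silently with any library update.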
I was able to add the cmap generation too.
Something like this:
Thank you for your support.
I hope we can come back to your submission with our next release. To enable us to do so, please mark your code/posting clearly with the lesser BSD license agreement.
Currently we are very busy, so please forgive any delay.
The Maven repository is not supported by us. We currently only provide code as a file download on this site. The Maven submission is managed by a third-party supporter.
Regards, Michael
mtraut mtraut@users.sf.net wrote on 10.6.2014 at 10:24:
Antti
OK, this is the new version. I added the 3-clause BSD license (which is what I assume you mean by "lesser") as a javadoc for the class.
... and excuse the name. This was supposed to be called jpodimprovements but I was just studying recent changes in pdfbox when I prepared the file.