#46 TextHighlightSample NumberFormatException: For input string:

0.1.2.1
open
None
3
2015-04-28
2013-04-19
Arun
No

Error while running TextHighlightSample.java with the attached PDF:

Exception in thread "main" java.lang.NumberFormatException: For input string: "0D280D4D0D26"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:495)
at org.pdfclown.documents.contents.fonts.CMapParser.parseUnicode(CMapParser.java:204)
at org.pdfclown.documents.contents.fonts.CMapParser.parse(CMapParser.java:112)
at org.pdfclown.documents.contents.fonts.Font.load(Font.java:734)
at org.pdfclown.documents.contents.fonts.Font.<init>(Font.java:351)
at org.pdfclown.documents.contents.fonts.CompositeFont.<init>(CompositeFont.java:123)
at org.pdfclown.documents.contents.fonts.Type2Font.<init>(Type2Font.java:57)
at org.pdfclown.documents.contents.fonts.Font.wrap(Font.java:261)
at org.pdfclown.documents.contents.FontResources.wrap(FontResources.java:72)
at org.pdfclown.documents.contents.FontResources.wrap(FontResources.java:40)
at org.pdfclown.documents.contents.ResourceItems.get(ResourceItems.java:119)
at org.pdfclown.documents.contents.objects.SetFont.getResource(SetFont.java:119)
at org.pdfclown.documents.contents.objects.SetFont.getFont(SetFont.java:83)
at org.pdfclown.documents.contents.objects.SetFont.scan(SetFont.java:97)
at org.pdfclown.documents.contents.ContentScanner.moveNext(ContentScanner.java:1330)
at org.pdfclown.documents.contents.ContentScanner$TextWrapper.extract(ContentScanner.java:811)
at org.pdfclown.documents.contents.ContentScanner$TextWrapper.<init>(ContentScanner.java:777)
at org.pdfclown.documents.contents.ContentScanner$TextWrapper.<init>(ContentScanner.java:765)
at org.pdfclown.documents.contents.ContentScanner$GraphicsObjectWrapper.get(ContentScanner.java:690)
at org.pdfclown.documents.contents.ContentScanner$GraphicsObjectWrapper.access$500(ContentScanner.java:679)
at org.pdfclown.documents.contents.ContentScanner.getCurrentWrapper(ContentScanner.java:1154)
at org.pdfclown.tools.TextExtractor.extract(TextExtractor.java:632)
at org.pdfclown.tools.TextExtractor.extract(TextExtractor.java:647)
at org.pdfclown.tools.TextExtractor.extract(TextExtractor.java:296)
at org.pdfclown.samples.cli.TextHighlightSample.run(TextHighlightSample.java:59)
at org.pdfclown.samples.cli.TextHighlightSample.main(TextHighlightSample.java:33)

1 Attachment

Discussion

  • Stefano Chizzolini

    • Group: v1.0_(example) --> 0.1.2.1
     
  • Stefano Chizzolini

    The sample file contains a composite font (JKANVA+Liya) whose character-to-Unicode map (ToUnicode CMap) maps some character codes to multi-character Unicode sequences, such as this Malayalam combination:

    <0159> <0D280D4D0D26>
    

    PDF Clown (0.1.2) currently handles only single-character Unicode mappings, so parsing an entry like this one fails; this is to be fixed.
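
For illustration only, here is a minimal sketch of how a multi-character ToUnicode value such as `0D280D4D0D26` could be decoded. It assumes the value is a run of UTF-16BE code units written as four hex digits each. The class `ToUnicodeHexSketch` and the method `decodeToUnicodeHex` are hypothetical names, not part of PDF Clown, and this is not necessarily how the library will address the issue. Parsing the whole string with `Integer.parseInt(hex, 16)` fails because the value does not fit in an `int`, which matches the exception in the report.

```java
final class ToUnicodeHexSketch {
    private ToUnicodeHexSketch() {
    }

    /**
     * Decodes a ToUnicode hex value into a String, one UTF-16 code unit
     * per group of four hex digits. Hypothetical helper, not PDF Clown API.
     */
    static String decodeToUnicodeHex(String hex) {
        // Assumption: well-formed values use whole 16-bit units.
        if (hex.isEmpty() || hex.length() % 4 != 0) {
            throw new IllegalArgumentException(
                "Expected a non-empty multiple of 4 hex digits: " + hex);
        }
        StringBuilder text = new StringBuilder(hex.length() / 4);
        for (int i = 0; i < hex.length(); i += 4) {
            // Each 4-digit group is at most 0xFFFF, so it always fits in a char.
            text.append((char) Integer.parseInt(hex.substring(i, i + 4), 16));
        }
        return text.toString();
    }

    public static void main(String[] args) {
        String decoded = decodeToUnicodeHex("0D280D4D0D26");
        // Three code units: U+0D28, U+0D4D, U+0D26.
        System.out.println(decoded.length()); // prints 3
    }
}
```

Appending code units rather than code points also keeps surrogate pairs intact, since a supplementary character appears in the hex value as two consecutive four-digit groups.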

     

    Last edit: Stefano Chizzolini 2015-04-28
