Object reference not set to an instance of an object
I get an error when trying to extract text from the attached PDF.
EXCEPTION: System.NullReferenceException: Object reference not set to an instance of an object.
   at org.pdfclown.documents.contents.fonts.Font.get_Flags()
   at org.pdfclown.documents.contents.fonts.SimpleFont.LoadEncoding()
   at org.pdfclown.documents.contents.fonts.SimpleFont.OnLoad()
   at org.pdfclown.documents.contents.fonts.Font.Load()
   at org.pdfclown.documents.contents.fonts.Font..ctor(PdfDirectObject baseObject)
   at org.pdfclown.documents.contents.fonts.SimpleFont..ctor(PdfDirectObject baseObject)
   at org.pdfclown.documents.contents.fonts.TrueTypeFont..ctor(PdfDirectObject baseObject)
   at org.pdfclown.documents.contents.fonts.Font.Wrap(PdfDirectObject baseObject)
   at org.pdfclown.documents.contents.FontResources.Wrap(PdfDirectObject baseObject)
   at org.pdfclown.documents.contents.ResourceItems`1.get_Item(PdfName key)
   at org.pdfclown.documents.contents.objects.SetFont.GetResource(IContentContext context)
   at org.pdfclown.documents.contents.objects.SetFont.GetFont(IContentContext context)
   at org.pdfclown.documents.contents.objects.SetFont.Scan(GraphicsState state)
   at org.pdfclown.documents.contents.ContentScanner.MoveNext()
   at org.pdfclown.documents.contents.ContentScanner.TextWrapper.Extract(ContentScanner level)
   at org.pdfclown.documents.contents.ContentScanner.TextWrapper..ctor(ContentScanner scanner)
   at org.pdfclown.documents.contents.ContentScanner.GraphicsObjectWrapper.Get(ContentScanner scanner)
   at org.pdfclown.documents.contents.ContentScanner.get_CurrentWrapper()
   at org.pdfclown.tools.TextExtractor.Extract(ContentScanner level, IList`1 extractedTextStrings)
   at org.pdfclown.tools.TextExtractor.Extract(IContentContext contentContext)
   at Digitaldoc.WebAPI.Services.Extractors.PdfToText.Extract()
Thanks for your attention
The attached document contains fonts without font descriptors, even though the PDF 1.7 specification requires them. The fix introduces more relaxed behavior that tolerates this spec violation.
Fixed on 0.1.2-Fix branch (rev 216) and 0.2.0 trunk (rev 217).
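For readers curious what "tolerating a missing font descriptor" can look like, here is a minimal Java sketch. It is not the actual PDF Clown code: the class LenientFontFlags, both methods, and the use of a plain Map as a stand-in for a font dictionary are assumptions made for illustration, and falling back to 0 (no flags set) is just one possible default. The strict variant fails the same way the stack trace above does, while the lenient one checks for the descriptor first.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names come from PDF Clown.
final class LenientFontFlags {

    // Strict lookup: assumes /FontDescriptor is always present.
    // Throws NullPointerException when the font dictionary lacks it.
    @SuppressWarnings("unchecked")
    static int strictFlags(Map<String, Object> fontDictionary) {
        Map<String, Object> descriptor =
            (Map<String, Object>) fontDictionary.get("FontDescriptor");
        return (Integer) descriptor.get("Flags");
    }

    // Lenient lookup: tolerates a missing descriptor or a missing /Flags entry.
    // Returning 0 (no flags set) is an assumed fallback, not the library's documented choice.
    static int lenientFlags(Map<String, Object> fontDictionary) {
        Object descriptor = fontDictionary.get("FontDescriptor");
        if (!(descriptor instanceof Map)) {
            return 0;
        }
        Object flags = ((Map<?, ?>) descriptor).get("Flags");
        return (flags instanceof Integer) ? (Integer) flags : 0;
    }

    public static void main(String[] args) {
        // A TrueType font dictionary with no /FontDescriptor entry,
        // like the fonts in the attached document.
        Map<String, Object> font = new HashMap<>();
        font.put("Subtype", "TrueType");

        System.out.println("lenient flags: " + lenientFlags(font));
        try {
            strictFlags(font);
        } catch (NullPointerException e) {
            System.out.println("strict lookup failed on the missing descriptor");
        }
    }
}
```

With a lookup like the lenient one, callers that only need the flags to choose an encoding can keep going on malformed files instead of aborting the whole text extraction.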
thank you