[PDFBox-user] Error converting date:CPY Document Mod Date ?
From: Dmitry G. <DGo...@at...> - 2008-03-20 18:01:14
Hi,

I was wondering if anyone has any info on the following error I get:

java.io.IOException: Error converting date:CPY Document Mod Date
    at org.pdfbox.util.DateConverter.toCalendar(DateConverter.java:254)
    at org.pdfbox.util.DateConverter.toCalendar(DateConverter.java:134)
    at org.pdfbox.cos.COSDictionary.getDate(COSDictionary.java:797)
    at org.pdfbox.pdmodel.PDDocumentInformation.getModificationDate(PDDocumentInformation.java:254)
    at com.attivio.textextraction.pdf.PdfBoxTextExtractor.getMetadata(PdfBoxTextExtractor.java:180)
    at com.attivio.textextraction.pdf.PdfBoxTextExtractor.extractText(PdfBoxTextExtractor.java:90)
    at com.attivio.textextraction.misc.TextExtractionRunner.processFile(TextExtractionRunner.java:235)
    at com.attivio.textextraction.misc.TextExtractionRunner.main(TextExtractionRunner.java:139)

Any clues as to why the date may be coming in with this invalid value of "CPY Document Mod Date"?

The debugger shows the following execution pattern. It looks like the COSDictionary gets headers for its data rather than legit values:

COSDictionary.getDate(COSName) line: 796
    COSString date = (COSString)getDictionaryObject( key ); // key = "ModDate"
COSDictionary.getDictionaryObject(COSName key)
    items has the following:
    {COSName{ModDate}=COSString{CPY Document Mod Date},
     COSName{Subject}=COSString{CPY Document Subject},
     COSName{Creator}=COSString{CPY Document Creator},
     COSName{ModDate--Text}=COSString{CPY Document Mod Date},
     COSName{Author}=COSString{CPY Document Author},
     COSName{Producer}=COSString{CPY Document Producer},
     COSName{CreationDate}=COSString{CPY Document Creation Date},
     COSName{Title}=COSString{CPY Document Title},
     COSName{Keywords}=COSString{CPY Document Keywords},
     COSName{CreationDate--Text}=COSString{CPY Document Creation Date}}
    returns COSString{CPY Document Mod Date}, which is passed into
    DateConverter.toCalendar(COSString date) with the value of "CPY Document Mod Date"
PDDocumentInformation.getModificationDate() line: 254
PdfBoxTextExtractor.getMetadata(String, List<AbstractProperty<?>>, PDDocument) line: 180
PdfBoxTextExtractor.extractText(DocumentPayload) line: 90
TextExtractionRunner.processFile(String) line: 235
TextExtractionRunner.main(String[]) line: 139

In a normal case, the date value would be something like:

{COSName{ModDate}=COSString{D:20061027110904-05'00'}}

Any info would be appreciated.

Thanks.
- Dmitry
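
The dump above suggests the document's Info dictionary really does contain placeholder text such as "CPY Document Mod Date" instead of a date, so one possible workaround is to check the raw string before asking for a parsed date. The following is only a sketch under that assumption: PdfDateGuard, looksLikePdfDate and dateOrEmpty are hypothetical names, not part of PDFBox, and the pattern is a loose approximation of the D:YYYYMMDDHHmmSS date form with an optional time-zone suffix.

```java
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical helper (not a PDFBox class): decides whether a raw metadata
// string looks like a PDF date before any attempt is made to parse it.
final class PdfDateGuard {

    // Loose approximation of the PDF date form: optional "D:" prefix, a
    // mandatory four-digit year, then optional two-digit month, day, hour,
    // minute and second, then an optional time zone such as Z or -05'00'.
    private static final Pattern PDF_DATE = Pattern.compile(
        "^(?:D:)?\\d{4}(?:\\d{2}){0,5}(?:Z|[+-]\\d{2}'?(?:\\d{2}'?)?)?$");

    private PdfDateGuard() {
    }

    // True only if the value is non-null and matches the date pattern.
    static boolean looksLikePdfDate(String raw) {
        return raw != null && PDF_DATE.matcher(raw.trim()).matches();
    }

    // Returns the trimmed value if it looks like a date, otherwise empty,
    // so callers can skip the field instead of failing on the whole file.
    static Optional<String> dateOrEmpty(String raw) {
        return looksLikePdfDate(raw) ? Optional.of(raw.trim()) : Optional.empty();
    }
}
```

With this check, the value D:20061027110904-05'00' is accepted and "CPY Document Mod Date" is rejected. Since the stack trace shows the IOException propagating out of PDDocumentInformation.getModificationDate(), catching that exception around the call in getMetadata and treating the date as missing would be another way to keep extraction going for such files.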