When reading the attached PDF, the exception below is thrown.
The cause is that the EI operator is preceded neither by an LF nor by a
space but by a CR character (code 13). The released version of jPod can be
fixed by adding || (next == 13) to the if:
CSContentParser (303-310):
/*
* spec is not clear but some internet articles claim that before
* "EI" a line break is required but spaces have been seen in real
* world documents; accept LF and space as possible end and check if
* valid operation follows.
*/
if ((next == '\n') || (next == ' ') || (next == 13)) {
// remember position
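For illustration, the same idea can be written so that any PDF white-space character is accepted before EI, not only LF, space and CR. This is a minimal standalone sketch, not jPod code: the class InlineImageEndSketch and the methods isPdfWhitespace and findEI are hypothetical names, and it assumes the inline image data is available as a byte array. Unlike the parser quoted above, it does not check that a valid operation follows; it only requires white space (or end of data) after EI.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of jPod.
final class InlineImageEndSketch {

    // White-space characters as defined by the PDF specification:
    // NUL, HT, LF, FF, CR and SPACE.
    static boolean isPdfWhitespace(int b) {
        return b == 0 || b == 9 || b == 10 || b == 12 || b == 13 || b == 32;
    }

    // Returns the index of the 'E' of the first "EI" that is preceded by
    // white space and followed by white space or the end of the data,
    // or -1 if no such position exists.
    static int findEI(byte[] data, int from) {
        for (int i = Math.max(from, 1); i + 1 < data.length; i++) {
            boolean isEI = data[i] == 'E' && data[i + 1] == 'I';
            if (!isEI) {
                continue;
            }
            boolean wsBefore = isPdfWhitespace(data[i - 1] & 0xFF);
            boolean wsAfter = i + 2 == data.length
                    || isPdfWhitespace(data[i + 2] & 0xFF);
            if (wsBefore && wsAfter) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // CR directly before EI, as in the attached file.
        byte[] sample = "ID abc\rEI Q".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(findEI(sample, 0));
    }
}
```

Because binary image data can itself contain the bytes "EI" surrounded by white space, a check like this can still stop too early; that is presumably why the parser also verifies that a valid operation follows.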
Exception:
de.intarsys.pdf.cos.COSRuntimeException: de.intarsys.pdf.parser.COSLoadError: EI expected at character index 18420
at de.intarsys.pdf.content.CSContent.createFromBytes(CSContent.java:111)
at de.intarsys.pdf.content.CSContent.createFromCos(CSContent.java:125)
at de.intarsys.pdf.pd.PDPage.getContentStream(PDPage.java:389)
at scireum.common.pdf.content.ExtractText.extractText(ExtractText.java:34)
at scireum.common.pdf.content.ExtractText.extractText(ExtractText.java:57)
at scireum.common.pdf.ExtractTextTest.expectTermsNotInFile(ExtractTextTest.java:73)
at scireum.common.pdf.ExtractTextTest.testExtract(ExtractTextTest.java:40)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:73)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:46)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:180)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:41)
at org.junit.runners.ParentRunner$1.evaluate(ParentRunner.java:173)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
at org.junit.runners.ParentRunner.run(ParentRunner.java:220)
at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:45)
at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:460)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:673)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:386)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:196)
Caused by: de.intarsys.pdf.parser.COSLoadError: EI expected at character index 18420
at de.intarsys.pdf.parser.CSContentParser.parseOperationEI(CSContentParser.java:404)
at de.intarsys.pdf.parser.CSContentParser.parseStream(CSContentParser.java:465)
at de.intarsys.pdf.parser.CSContentParser.parseStream(CSContentParser.java:433)
at de.intarsys.pdf.content.CSContent.createFromBytes(CSContent.java:107)
... 30 more
best regards
Andy
Elfi Heck
Date: 2009-10-19 14:27
I've now changed our code as you propose to also accept a single CR as
| Filename | Description | Download |
|---|---|---|
| 5.pdf | Example file | Download |
| Field | Old Value | Date | By |
|---|---|---|---|
| close_date | - | 2009-10-19 14:27 | eheck |
| resolution_id | None | 2009-10-19 14:27 | eheck |
| status_id | Open | 2009-10-19 14:27 | eheck |
| assigned_to | nobody | 2009-10-19 14:23 | eheck |
| File Added | 346864: 5.pdf | 2009-10-16 06:55 | scireumaha |