- assigned_to: nobody --> nnassar
Classes:
com.etymon.pj.PdfParser
com.etymon.pj.object.PjNumber
In PdfParser (lines 339 - 343), the parse() method
pushes a number onto the stack.
It stores the number as a float (call to
PjNumber()).
If the number is large enough, the
conversion to float loses precision
(example: 19876543).
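The precision loss can be reproduced outside PJ. A float has a 24-bit significand, so integers above 2^24 (16777216) are not all representable. The following is a minimal sketch, not PJ code; the class and method names are made up for illustration.

```java
// Illustrative only: this class is not part of the PJ sources.
class FloatPrecisionDemo {

    // Mimics storing a parsed number as a float and reading it back as an int.
    public static int viaFloat(String token) {
        float f = Float.parseFloat(token);
        return (int) f;
    }

    // Same round trip, but through a double.
    public static int viaDouble(String token) {
        double d = Double.parseDouble(token);
        return (int) d;
    }

    public static void main(String[] args) {
        String[] tokens = {"16777216", "16777217", "19876543", "20467439"};
        for (String t : tokens) {
            System.out.println(t + " float=" + viaFloat(t) + " double=" + viaDouble(t));
        }
        // 16777216 float=16777216 double=16777216
        // 16777217 float=16777216 double=16777217
        // 19876543 float=19876544 double=19876543
        // 20467439 float=20467440 double=20467439
    }
}
```

For both values mentioned in this report, the float round trip yields a number one larger than the original.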
When the parse() method handles a "stream" (line 281), it
reads the length of the stream
(<state._stream>).
If that value has lost precision in the float
conversion, the getToken() method
moves the position within the PDF (<state._pos>) by the
wrong amount.
If the value is larger than the actual size of the stream,
the next token can be "ndstream"
instead of "endstream". This prevents the parse()
method from handling the "endstream" token;
instead, it pushes a string onto the stack and throws an exception
at the return statement (line 427):
return (PjObject)(stack.pop());
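The effect of an off-by-one length can be shown with a toy buffer. This is only a sketch under simplifying assumptions: the stream data is followed directly by "endstream" with no end-of-line marker, and afterSkip() is a hypothetical helper, not the real getToken().

```java
// Illustrative only: this class is not part of the PJ sources.
class StreamSkipDemo {

    // Returns the text a tokenizer would see after skipping
    // `length` bytes of stream data starting at `pos`.
    public static String afterSkip(String pdf, int pos, int length) {
        return pdf.substring(pos + length);
    }

    public static void main(String[] args) {
        String data = "0123456789";          // stands in for the stream bytes
        String pdf = data + "endstream";

        // Correct length: the next token is "endstream".
        System.out.println(afterSkip(pdf, 0, data.length()));

        // Length too large by one: the next token is "ndstream".
        System.out.println(afterSkip(pdf, 0, data.length() + 1));
    }
}
```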
The problem occurred with a PDF containing a
high-resolution image.
The size of the stream was 20467439.
No PDF is attached because its size is 20 MB.
I propose the following workaround, although other changes
could be made to correct the problem.
PjNumber.java:
Along with the float value, the class stores a double value.
A new constructor stores both the double value and the float
value:
public PjNumber(double d) {
    _d = d;
    _f = (float) d;
}
A new method returns the double value as an int:
public int getIntFromDouble() {
    return (int) _d;
}
PdfParser.java:
In the parse() method, a number is stored as a double:
stack.push(new PjNumber(new Double(state._token).doubleValue()));
In the parse() method, when a "stream" is handled, the
length of the stream is read from the double value:
state._stream = ((PjNumber) obj).getIntFromDouble();
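Put together, the workaround behaves as in the following self-contained sketch. PjNumberSketch is a hypothetical stand-in for PjNumber (the real class has more members), and getIntFromFloat() exists here only for comparison.

```java
// Hypothetical stand-in for PjNumber; not the real PJ class.
class PjNumberSketch {
    private final double _d; // full-precision value
    private final float _f;  // kept for existing callers that expect a float

    public PjNumberSketch(double d) {
        _d = d;
        _f = (float) d;
    }

    // Hypothetical accessor, for comparison only: int value taken from the float.
    public int getIntFromFloat() {
        return (int) _f;
    }

    // Accessor proposed in this report: int value taken from the double.
    public int getIntFromDouble() {
        return (int) _d;
    }

    public static void main(String[] args) {
        // Stream length from the failing PDF, parsed the way the patch does.
        PjNumberSketch n = new PjNumberSketch(Double.parseDouble("20467439"));
        System.out.println(n.getIntFromFloat());  // 20467440
        System.out.println(n.getIntFromDouble()); // 20467439
    }
}
```

A double represents every integer up to 2^53 exactly, so any stream length that fits in an int survives this round trip.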