PoDoFo / Tickets / #7 CVE-2017-8053 - infinite recursion and stack consumption

Mattia Rizzolo - 2018-02-24

https://bugs.debian.org/860994

This CVE is also already partly fixed in r1834 for the infinite recursion part.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

zyx - 2018-03-11

Right, trying (runing podofopdfinfo on) the provided pdf on current trunk (revision 1909) doesn't cause any memory issue. I propose to close thig ticker in favour of the one mentioned by Mattia above. The part about stack usage due to recursion when reading file is filled as ticket #15.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Mattia Rizzolo - 2018-03-11

The part about stack usage due to recursion when reading file is filled as ticket #15.

Is it really the same stack consumption bug?
ISTM this is about PoDoFo::PdfParser::ReadDocumentStructure(), whereas CVE-2018-8002 is about PoDoFo::PdfParserObject::ParseFileComplete()

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

zyx - 2018-03-13

As I said, I cannot reproduce on r1909 with the given reproducer pdf file and given steps. You found it's fixed in r1834. Thus there's basically nothing to do here. But the description mentions two issues, "infinite recursion" and "stack consumption". The "infinite recursion" part is done. The "stack consumption" part is not done, even, if the "infinite recursion" happened on the stack, then it makes sense it aborts on stack overflow. I mean, either we link together this and the ticket #15 for the "stack consumption" part, or we consider both parts being tight together enough to treath them as one thing, not as two. In either way, there is nothing to work on here, from my point of view, thus this ticket can be closed.

A confirmation from someone else whom be able to test this against current development version will be highly appreciated, because there's always a chance that I did test this incorrectly. That's why I didn't close this ticket yet.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

MarkR - 2018-04-16

I've added unit tests for this - the fix prevents the case where the /Prev entry in an XRef stream points back to itself:

xref 0 1 000000000 65535 2 0 obj << /Type XRef /Prev offsetXrefStmObj2 >> stream ... endstream trailer << /Root 1 0 R /Size 3 >> startxref offsetXrefStmObj2 %%EOF

but it does not fix the case where there's a /Prev loop in a series of XRef streams (this still overflows the stack due to deeply nested calls to ReadXRefContents and ReadXRefStreamContents)

xref 0 1 000000000 65535 2 0 obj << /Type XRef /Prev offsetXrefStmObj3 >> stream ... endstream 3 0 obj << /Type XRef /Prev offsetXrefStmObj2 >> stream ... endstream trailer << /Root 1 0 R /Size 3 >> startxref offsetXrefStmObj2 %%EOF

There was an identical issue with the /Prev entry in the trailer dictionary which is used for finding previous cross-reference tables. Thid was fixed by a recursion guard in PdfParser::m_nReadNextTrailerLevel which is incremented/decremented in the recursive calls to ReadNextTrailer().

I can think of three alternative fixes:

Simplest fix is renaming PdfParser::m_nReadNextTrailerLevel to PdfParser::m_nRecursionDepth and placing the same recursion checks from ReadNextTrailer() into ReadXRefContents and ReadXRefStreamContents . That's better than adding a new variable because all the methods share the same stack so should use the same stack depth count. This is easily extended to other recursive methods in PdfParser.

Adding a recursion depth parameter to all affected methods, with the same depth checks as above, but this might cause ABI issues.

Re-structuring the code to avoid recursion and follow the /Prev chains in a loop (with a maxium count) inside ReadDocumentStructure() or ReadXRefContents(). This looks like a much bigger change which needs a lot of testing and could create other problems

Last edit: MarkR 2018-04-16
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

MarkR - 2018-04-18

I've just posted my parser unit tests to the mailing list. There's still a stack overflow which r1920 didn't fix (running the unit tests will show it). The cause is a set of 10,000 XRef streams all with valid /Prev entries but no loops:

2 0 obj << /Type XRef >> stream ... endstream 3 0 obj << /Type XRef /Prev offsetStreamObj(2) >> stream ... endstream 4 0 obj << /Type XRef /Prev offsetStreamObj(3) >> stream ... endstream ... N 0 obj << /Type XRef /Prev offsetStreamObj(N-1) >> stream ... endstream

There's also another issue with using offsets to check for cycles - the same XRef stream can be reached by different offsets due to the way whitespace and comments are discarded by the tokenizer:

% this XRef stream can be reached by many different offsets % 3 offsets pointing to the % comment tokens, others to leading % whitespace and one offset pointing exactly to "2 0 obj" below 2 0 obj << /Type XRef >> stream ... endstream
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

MarkR - 2018-04-18

Here's a recursion guard that modifies the patch I sent in 2012 for stack overflow when following trailer /Prev chains - a very similar issue to this CVE:
https://sourceforge.net/p/podofo/mailman/message/29548894/

It works by adding an new RIAA class called PdfRecursionGuard, and to use it you just add the following line to any PdfParser methods you need to protect:
PdfRecursionGuard guard(m_nRecursionDepth);

The patch adds PdfRecursionGuard to ReadXRefContents and ReadXRefStreamContents, and replaces the existing guard on ReadNextTrailer (which was the problem my 2012 patch fixed). This fixes the problem with 10,000 XRef streams in the comment above.

The code is very simple so it should be easy to audit for problems, and it can be used alongside other existing patches. It also guards against undicovered recursion problems in the methods it's placed in

patch-PdfRecursionGuard.diff

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

zyx - 2018-04-19

Dom, you took this issue, thus I do not want to step on your toes, but I made a quick look at Mark's change and it looks harmless, it basically extracts existing code into a special class and uses it on two places. It might be safe to include in this stage before release, from my point of view.

What do you think, Dom?

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Dominik Seichter - 2018-04-30

Hi zyx, Hi MarkR!

Yes, I agree. The patch looks fine and it might even make sense to move PdfRecursionGuard to a source file of its own instead of keeping it inside of PdfParser. As soon as we need it somewhere else, we should move it outside. For now, I will commit the patch as revision 1924! @MarkR: Thanks!

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

zyx - 2018-05-02

status: open --> closed
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

CVE-2017-8053 - infinite recursion and stack consumption

A PDF parsing, modification and creation library.

Milestone

Searches

Help

#7 CVE-2017-8053 - infinite recursion and stack consumption

Related

Discussion