#88 Infinite loop in HtmlPage::endString

Status: open
Owner: nobody
Labels: None
Priority: 5
Updated: 2008-02-05
Created: 2008-02-05
Creator: Jan Adámek
Private: No

When parsing page 41 of <http://www.ciur.cz/fileadmin/main/_files/IOM-Split kazeta DCM24_03.pdf>, HtmlPage::endString gets stuck in an infinite loop here:
for (p1 = NULL, p2 = yxStrings; p2; p1 = p2, p2 = p2->yxNext) {
  if (y1 < p2->yMin || (y2 < p2->yMax && curStr->xMax < p2->xMin))
    break;
}

Found in version 0.36. Poppler library used is 0.6.0.

KPDF can display the page with the same poppler library in about two minutes on a 1.66 GHz Intel Core 2 (that is a long time, but at least it is finite).

In a PDF-downloading application based on pdftohtml, it looped for over a day on a 2 GHz Xeon with no progress; every backtrace showed the same position. I haven't tested whether execution stays inside this one loop the whole time, or whether it leaves the function, returns to it, and merely spends most of its time there.
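One way to tell a true infinite loop from a very slow traversal would be to check whether the list walked via yxNext contains a cycle, since the loop above can only run forever inside a single call if p2 never reaches NULL. The following is a minimal sketch of such a check using Floyd's tortoise-and-hare technique. The Node type and the hasCycle function are hypothetical stand-ins written for illustration; they are not part of poppler or pdftohtml, and nothing in this report confirms that the list is actually cyclic.

```cpp
// Hypothetical stand-in for the real string node; only the link matters here.
struct Node {
    Node *yxNext;
};

// Floyd's tortoise-and-hare check: returns true if following yxNext from
// head eventually revisits a node, meaning a traversal never reaches NULL.
bool hasCycle(const Node *head) {
    const Node *slow = head;  // advances one node per step
    const Node *fast = head;  // advances two nodes per step
    while (fast && fast->yxNext) {
        slow = slow->yxNext;
        fast = fast->yxNext->yxNext;
        if (slow == fast)
            return true;      // the pointers met, so the list loops back
    }
    return false;             // fast reached the end of the list
}
```

If a check like this returned false for yxStrings while the program was stuck, the more likely explanation would be that the list is simply very long and the function is being called once per string, which would make the total work quadratic rather than infinite.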

Relevant backtrace:
#0 0x00000000004113cb in HtmlPage::endString (this=0x4c46cb0) at poppler/HtmlOutputDev.cc:294
#1 0x00002b9d8c06f0f7 in Gfx::doShowText () from /usr/lib/libpoppler.so.2
#2 0x00002b9d8c06f98d in Gfx::opShowSpaceText () from /usr/lib/libpoppler.so.2
#3 0x00002b9d8c06cc28 in Gfx::go () from /usr/lib/libpoppler.so.2
#4 0x00002b9d8c06d075 in Gfx::display () from /usr/lib/libpoppler.so.2
#5 0x00002b9d8c0aee25 in Page::displaySlice () from /usr/lib/libpoppler.so.2
#6 0x00002b9d8c0aeebd in Page::display () from /usr/lib/libpoppler.so.2
#7 0x00002b9d8c0b0735 in PDFDoc::displayPages () from /usr/lib/libpoppler.so.2

Discussion