#117 PDF files are reread several times and analyzed forever

Status: closed-fixed
Owner: nobody
Labels: analyzers (34)
Priority: 5
Updated: 2011-07-05
Created: 2011-07-05
Creator: Rafał Rzepecki
Private: No

The PDF analyzer currently reads the entire file again whenever it runs out of data. This causes it to run essentially forever on large files (see, for example, [1]). The attached patch (against 33da94dfb4732a47ae5c02c194ae876808e0b493 from the KDE repository) solves the problem by reading only the part that is needed. With the patch, analyzing [1] takes 10 s; without it, the analysis ran so long that I didn't have the patience to wait for it to finish.

Note that I have only done some rudimentary testing. It seems OK, but if you have any tried and known test cases, you might want to check that there isn't an off-by-one error in there.

[1] http://dl.dropbox.com/u/28658707/hugepdf.pdf (beware, 55MiB)
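
To illustrate the difference between rereading and reading only what is needed, here is a minimal, self-contained sketch. It is not the attached patch and does not use the Strigi stream API; `CountingSource`, `consumeByRereading`, and `consumeIncrementally` are hypothetical names made up for this example. The first function starts over from offset 0 every time it needs more data, so the total number of bytes read grows roughly quadratically with file size. The second one reads only the bytes it has not seen yet, so the total stays linear.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical stand-in for an input file; it counts every byte it hands out.
class CountingSource {
public:
    explicit CountingSource(std::string data) : data_(std::move(data)) {}

    // Append up to n bytes starting at offset to out; return the number appended.
    std::size_t readAt(std::size_t offset, std::size_t n, std::string& out) {
        if (offset >= data_.size()) return 0;
        const std::size_t len = std::min(n, data_.size() - offset);
        out.append(data_, offset, len);
        bytesRead_ += len;
        return len;
    }

    std::size_t bytesRead() const { return bytesRead_; }

private:
    std::string data_;
    std::size_t bytesRead_ = 0;
};

// Problematic pattern: whenever more data is needed, throw the buffer away
// and read everything again from the beginning of the file.
std::size_t consumeByRereading(CountingSource& src, std::size_t chunk) {
    assert(chunk > 0);
    std::string buffer;
    std::size_t wanted = chunk;
    for (;;) {
        buffer.clear();
        src.readAt(0, wanted, buffer);      // starts over at offset 0
        if (buffer.size() < wanted) break;  // end of data reached
        wanted += chunk;
    }
    return buffer.size();
}

// Fixed pattern: keep what was already read and fetch only the missing part.
std::size_t consumeIncrementally(CountingSource& src, std::size_t chunk) {
    assert(chunk > 0);
    std::string buffer;
    for (;;) {
        const std::size_t got = src.readAt(buffer.size(), chunk, buffer);
        if (got < chunk) break;             // end of data reached
    }
    return buffer.size();
}
```

Comparing `bytesRead()` after running each function on the same input shows how much redundant I/O the rereading variant performs. Loop bounds like the ones above are also where an off-by-one error would most likely hide, so inputs whose size is an exact multiple of the chunk size are worth testing.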

Discussion

  • Commit 672c8b7b3cb4400bc505421f4cc70cf742ed9df0

     
    • status: open --> closed-fixed