#13 Slow rendering of very large tables

Status:   open
Owner:    nobody
Labels:   platypus (4)
Priority: 5
Updated:  2014-07-30
Created:  2003-04-28
Creator:  Anonymous
Private:  No

An O(n) rendering engine for Tables would be
beneficial.

Some very informal testing suggests that the
rendering engine for Tables runs in O(n^2) time.
While this is not a problem for tables of up to a
few thousand rows, it becomes prohibitive as the
table grows larger.

Some time samples follow for my report generator
running on a 1 GHz processor with 256 MB of RAM.
Note that this is for the rendering only; the times
I recorded for the data preparation appear to be
O(n). Also, the RAM available in the machine was
sufficient to perform these runs without significant
virtual-memory usage. The OS is Windows NT 4.0, SP6.

Rows    Rendering time (sec)
1000     3.72
2000    11.75
3000    25.74
4000    46.64
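
The time ratios (roughly 3.2x, 6.9x, and 12.5x for 2x, 3x, and
4x the rows) track the quadratic 4x/9x/16x fairly closely. A
minimal harness along these lines should reproduce numbers of
this shape (a sketch using the standard platypus API, not my
actual report generator):

    # Timing sketch: build an n-row Table into a PDF and measure
    # wall-clock rendering time; data preparation stays outside the timer.
    import time
    from reportlab.platypus import SimpleDocTemplate, Table

    def time_render(n_rows):
        data = [[str(i), "some cell text", "more text"] for i in range(n_rows)]
        doc = SimpleDocTemplate("table_%d.pdf" % n_rows)
        start = time.time()
        doc.build([Table(data)])
        return time.time() - start

    for n in (1000, 2000, 3000, 4000):
        print("%6d rows: %6.2f sec" % (n, time_render(n)))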

Ken Brubaker
Intercon Security
ken_brubaker@interconsecurity.com

Discussion

  • Anonymous (not logged in)

    Indeed. Table generation can be very slow even on my dual
    2.4 GHz Xeon with 2 GB of RAM. OS == RedHat 8, but with
    Python 2.3 and RL 1.19.

    I've found that making sure to build _rl_accel helps, and using
    Psyco can help some more, but it's still far from fast. The main
    issue is that this makes it much harder to use ReportLab for
    interactive functions, such as CGI PDF generation or as a
    plug-in print system for an app.
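
    For reference, the two speedups look roughly like this: the
    _rl_accel C extension is picked up automatically once built, and
    Psyco (Python 2.x only) has to be enabled explicitly (a sketch):

        # Sketch: ReportLab uses the compiled _rl_accel extension
        # automatically if it is importable; Psyco needs an explicit call.
        try:
            import psyco
            psyco.full()   # JIT-specialise hot Python functions
        except ImportError:
            pass           # no Psyco available; run unaccelerated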

  • Craig Ringer
    2004-05-13

    Going by your numbers, I assume you're already passing a
    rowHeights= parameter to the Table constructor. Without that,
    even a 2000-row run on my laptop (PIII-800) takes a bit over a
    minute; with rowHeights passed explicitly, execution takes five
    seconds.
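
    For anyone following along, passing explicit row heights looks
    like this (a sketch; the 18pt row height is an arbitrary
    example value):

        # Sketch: supplying rowHeights up front lets Table skip the
        # expensive per-row height calculation (_calc_height, below).
        from reportlab.platypus import Table

        n_rows = 2000
        data = [["row %d" % i, "some text"] for i in range(n_rows)]
        table = Table(data, rowHeights=[18] * n_rows)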

    A little profiling revealed that the _calc_height method of the
    Table class was by far the #1 CPU-time eater when no
    rowHeights parameter was passed.

    When I do pass rowHeights, I'm seeing much the same
    performance behaviour as you.

    Some more profiling shows that the functions taking most of
    the time when a rowHeights param is passed are the bodies of
    Table.__init__ (50% on a 4000-row sample) and
    CellStyle1.__init__ (another 10%).

    BTW, my first comment on reading tables.py: "Where are the
    docstrings?!?!?".

    Anyway, if I add some print statements and run over a
    100,000-row table, part of what is happening becomes clear:

    STARTING <instance object at 0x4035794c>
    DONE: 100000 passes
    STARTING <instance object at 0x403ac0ec>
    DONE: 56 passes
    STARTING <instance object at 0x403ac1ac>
    DONE: 99944 passes
    STARTING <instance object at 0x403ac46c>
    DONE: 56 passes
    STARTING <instance object at 0x403ac30c>
    DONE: 99888 passes
    STARTING <instance object at 0x403ac4cc>
    DONE: 56 passes
    STARTING <instance object at 0x403ac88c>
    DONE: 99832 passes
    STARTING <instance object at 0x403ac5ac>
    DONE: 56 passes
    STARTING <instance object at 0x403ace4c>
    DONE: 99776 passes

    (etc.) Each time Table.split is called, the table splits off
    another roughly page-sized chunk, returns it, and creates a
    whole new Table instance to hold the rest. That new table then
    has Table.split called on it in turn. The Table.split function
    looks like it's written to be called recursively like this -
    effective on a small table, less nice on a large one.

    I haven't calculated it out, but intuitively this behaviour
    seems consistent with the O(N**2) completion time you
    mentioned: doubling the table size doubles the number of split
    calls (assuming fixed row heights) and doubles the size of the
    chunks being copied at the start, so the total work roughly
    quadruples. Ouch.
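
    As a toy model of the pattern (not ReportLab's actual code),
    the peel-and-rebuild loop looks like this; the remainder copy
    on every pass is what makes it quadratic:

        # Toy model: peel off one page of rows, rebuild the remainder as
        # a "new table", repeat. Each pass copies O(remaining) rows, so
        # n/page_size passes cost O(n**2) overall.
        def split_repeatedly(rows, page_size):
            pages = []
            while rows:
                pages.append(rows[:page_size])  # chunk returned to the caller
                rows = rows[page_size:]         # fresh copy, like the new Table
            return pages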

    This doesn't appear to be the only problem, but it's certainly
    not helping.

    I'm looking into this more now, but I'm relatively new to Python
    and ReportLab, so no results guaranteed. The one-char variable
    names, near-total absence of comments, and lack of docstrings
    aren't helping me one bit - *sigh*.

  • Craig Ringer
    2004-05-13

    OK. I've done a gruesomely ugly special-case hack as a test.
    It currently doesn't work with repeatRows, completely breaks
    styling, and probably fails on a document with anything but a
    table on it. The goal was to see whether Table.split was really
    the problem - breaking everything be damned, it's not like I'd
    ever use this test code (or show it to anybody - *ICK!*).

    Basically, I re-wrote Table.split to loop over the table data,
    create a new Table instance for each page-sized block, and
    return the lot in one go.

    It does seem to be VASTLY faster on larger tables, but I can't
    be sure until I fix repeatRows and styles. Even then, the
    handling of varying target drawing areas will still be utterly
    broken, making it completely useless.

    It did, however, do the job: isolating the speed problem to
    the split method.

    If I contrast (a) the original tables.py, (b) my mangled
    tables.py, and (c) the original with the same code (for styles
    and repeatRows) removed as from the mangled one, I see run
    times like these:

    4000 rows:
    (a) 20s
    (b) 5s
    (c) 19s

    Note also that this is now without repeatRows (I was using
    repeatRows=1 when seeing times vaguely approximating
    yours).

    Of course, while the PDF output from my hacked-up version
    _looks_ OK, it could be horribly broken in some way, too. This
    is just an early test.

    Hmm... the results at 8000 rows are even more promising:
    (a) 1:20
    (b) 0:10
    (c) 1:19

    The core changes I've made to do this (sketched in code after
    this list) are:

    (a) scan self._rowHeights to see if all values are equal

    (b) If the table row heights are all the same, use the "n"
    value from above to figure out how many tables we'll need to
    hold all the results.

    (c) loop over xrange(num_tables), and break out an
    "n"-sized block of rows from "data" on each loop. Create a
    new Table instance with that data, and add it to a list of tables
    to return.
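
    In rough outline, that looks something like this (a hedged
    sketch, not the actual patch; rows_per_page stands in for the
    "n" above):

        # Sketch of the uniform-row-height fast path: slice the data into
        # page-sized blocks in one pass instead of recursively re-creating
        # a Table for the remainder. Hypothetical helper, not the real patch.
        from reportlab.platypus import Table

        def split_all_at_once(data, row_height, avail_height):
            rows_per_page = max(1, int(avail_height // row_height))
            num_tables = (len(data) + rows_per_page - 1) // rows_per_page
            tables = []
            for i in range(num_tables):
                block = data[i * rows_per_page:(i + 1) * rows_per_page]
                tables.append(Table(block, rowHeights=[row_height] * len(block)))
            return tables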

    This avoids the recursive table re-creation. HOWEVER, I
    suspect the current system does what it does for a reason. My
    best guess is that it handles varying frame sizes that way.
    After all, if you have 3 pages - one with room for 25 rows, one
    with room for 50, and one with room for another 50 - you don't
    want to split into 5 tables, only 3. *arrggh*.

    There has to be a better way of doing this - either handling
    the case where all row heights are equal _and_ all frame sizes
    are equal separately, or reducing the penalty for the repeated
    Table re-creation.

    Ideas? Maybe subclass Table to create one that can't handle
    varying-sized frames etc., but is vastly faster? It could draw
    the first page normally, then for second and subsequent pages
    use the method above instead - that could work.
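
    A rough sketch of that subclass idea (FastUniformTable is a
    made-up name; _cellvalues and _rowHeights follow the attribute
    names used in tables.py, so treat this as a sketch rather than
    working code against any particular release):

        # Hypothetical subclass: assumes every frame is the same size and
        # every row the same height, so split() can slice out all the
        # remaining pages directly instead of recursing.
        from reportlab.platypus import Table

        class FastUniformTable(Table):
            def split(self, availWidth, availHeight):
                row_h = self._rowHeights[0]
                per_page = max(1, int(availHeight // row_h))
                data = self._cellvalues
                return [Table(data[i:i + per_page],
                              rowHeights=[row_h] * len(data[i:i + per_page]))
                        for i in range(0, len(data), per_page)]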

    My 20,000-row test just completed as I was about to send
    this. Run time: 24 seconds on a PIII/800 laptop. The split
    method is definitely the problem, as the original code would
    have taken many minutes. In fact, a 100,000-row run just
    finished in 2 min 11 sec - all 1755 pages and 2.68 MB of it :-)

    (Sorry if I'm a bit incoherent here. It's almost 1:00AM and I've
    been awake for _far_ too long).

    I'll try to assemble this into a useful post and ask on the list if I
    don't see anything here in the next few days. I'm going to be
    rather busy and won't be able to do more on it for a little while.

  • Craig Ringer
    2004-05-13

    I just did a few quick calculations on my test numbers (rows
    divided by the hacked version's wall-clock run times), just out
    of curiosity. The results are interesting:

    4000 rows   : 800 rows/sec
    8000 rows   : 800 rows/sec
    20000 rows  : 833 rows/sec
    100000 rows : 763 rows/sec

    In other words, it looks like the repeated table re-splitting
    and re-creation really is the killer.

    I also find it amusing that I get almost exactly one row
    generated per million CPU cycles - 800 rows/sec on an 800 MHz
    PIII ;-)