#8 Make it scaleable for multicore and multithread

Status: open
Owner: nobody
Labels: None
Priority: 5
Updated: 2009-01-22
Created: 2009-01-22
Creator: André Mussche
Private: No

For memory-intensive applications with multiple threads, FastMM does not scale very well: there is no 2x improvement when the application runs on a dual-core machine instead of a single-core one. The cause is a single contention point (hotspot) in the code:

function FastGetMem(ASize: Integer): Pointer;
...
  {Try to lock the small block type}
  if LockCmpxchg(0, 1, @LPSmallBlockType.BlockTypeLocked) = 0 then
    Break;
...
end;

Locking like this is bad for scaling. Two interesting blog posts on the performance cost of CAS operations:
http://www.bluebytesoftware.com/blog/2009/01/09/SomePerformanceImplicationsOfCASOperations.aspx
http://www.bluebytesoftware.com/blog/2009/01/13/SomePerformanceImplicationsOfCASOperationsRedux.aspx

I made a quick fix that gives each thread its own memory pool, so locking is no longer needed. Result: it scales almost perfectly and is much faster!

Discussion

  • 1. Use an additional thread index slot to assign each thread its own memory pool.
    2. Always allocate memory from the thread's own pool.
    3. When releasing memory, if the block is not found in the thread's own pool, search the global pool list to release it. To minimize the search, use a few low-order bits to identify its pool or pool group.
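
    The three steps above could be sketched roughly as follows. This is only an illustration, not FastMM code and not the quick fix from the ticket: every name in it (PoolCount, TBlock, PoolAlloc, PoolFree, MySlot) is made up, it handles a single block size, and it assumes there are never more threads than pools. Instead of encoding the pool in low-order bits, it stores the owner index in a header field, which is a simplification. Blocks released by a foreign thread go onto a small locked list on the owning pool, so the lock is only touched on the cross-thread path.

```pascal
program PerThreadPoolSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  {$IFDEF FPC}{$IFDEF UNIX}cthreads,{$ENDIF}{$ENDIF}
  SysUtils, Classes, SyncObjs;

const
  PoolCount = 4;   // hypothetical fixed number of pools
  BlockSize = 64;  // hypothetical payload size, one size class only

type
  PBlock = ^TBlock;
  TBlock = record
    Owner: Integer;  // index of the pool that created this block
    Next: PBlock;    // link used while the block sits on a free list
    Data: array[0..BlockSize - 1] of Byte;
  end;

  TPool = record
    LocalFree: PBlock;             // used only by the owning thread, no lock
    RemoteFree: PBlock;            // blocks handed back by other threads
    RemoteLock: TCriticalSection;  // protects RemoteFree only
  end;

  TWorker = class(TThread)
  public
    Block: PBlock;
  protected
    procedure Execute; override;
  end;

var
  Pools: array[0..PoolCount - 1] of TPool;
  SlotLock: TCriticalSection;
  NextSlot: Integer = 0;

threadvar
  MySlot: Integer;  // step 1: 0 = not assigned yet, otherwise pool index + 1

function MyPool: Integer;
begin
  if MySlot = 0 then
  begin
    SlotLock.Acquire;
    try
      if NextSlot >= PoolCount then
        raise Exception.Create('no free pool left in this sketch');
      Inc(NextSlot);
      MySlot := NextSlot;
    finally
      SlotLock.Release;
    end;
  end;
  Result := MySlot - 1;
end;

function PoolAlloc: PBlock;  // step 2: always allocate from the own pool
var
  P: Integer;
begin
  P := MyPool;
  if Pools[P].LocalFree = nil then
  begin
    // Local list is empty: take over whatever other threads returned.
    Pools[P].RemoteLock.Acquire;
    try
      Pools[P].LocalFree := Pools[P].RemoteFree;
      Pools[P].RemoteFree := nil;
    finally
      Pools[P].RemoteLock.Release;
    end;
  end;
  Result := Pools[P].LocalFree;
  if Result <> nil then
    Pools[P].LocalFree := Result^.Next
  else
  begin
    New(Result);  // stands in for getting fresh memory from the OS
    Result^.Owner := P;
  end;
end;

procedure PoolFree(B: PBlock);  // step 3: route the block to its owner
var
  P: Integer;
begin
  P := MyPool;
  if B^.Owner = P then
  begin
    B^.Next := Pools[P].LocalFree;  // own block: lock-free fast path
    Pools[P].LocalFree := B;
  end
  else
  begin
    Pools[B^.Owner].RemoteLock.Acquire;  // foreign block: owner's remote list
    try
      B^.Next := Pools[B^.Owner].RemoteFree;
      Pools[B^.Owner].RemoteFree := B;
    finally
      Pools[B^.Owner].RemoteLock.Release;
    end;
  end;
end;

procedure TWorker.Execute;
begin
  Block := PoolAlloc;  // allocated from the worker thread's own pool
end;

procedure RunDemo;
var
  I: Integer;
  W: TWorker;
  Mine: PBlock;
begin
  SlotLock := TCriticalSection.Create;
  for I := 0 to PoolCount - 1 do
    Pools[I].RemoteLock := TCriticalSection.Create;
  Mine := PoolAlloc;  // main thread takes the first pool
  W := TWorker.Create(False);
  W.WaitFor;
  Writeln('main block owner:   ', Mine^.Owner);
  Writeln('worker block owner: ', W.Block^.Owner);
  PoolFree(Mine);     // local path, no lock taken
  PoolFree(W.Block);  // cross-thread path, lands on the worker pool's remote list
  W.Free;
  // Releasing the pooled blocks and the locks is omitted for brevity.
end;

begin
  RunDemo;
end.
```

    In this sketch the main thread's block reports owner 0 and the worker's block reports owner 1. Only the second PoolFree call takes a lock. A real allocator would also have to deal with thread exit, multiple size classes and more threads than pools.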

     
  • gol
    2010-09-15

    I'm pretty sure this is not the problem at all. The problem is the call to Sleep, which is pretty bad. FastMM has an option to spin instead of sleeping, which is good, but better still would be an option to spin first and eventually wait on an event. This is what a critical section does. But sleeping... is very bad.
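
    A minimal sketch of the "spin, then wait" idea, assuming a plain TCriticalSection from SyncObjs as the blocking primitive. It is not how FastMM locks its block types; SpinLimit, PoolLock and AcquireSpinThenWait are made-up names, and how Enter actually blocks depends on the platform.

```pascal
program SpinThenWaitSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  {$IFDEF FPC}{$IFDEF UNIX}cthreads,{$ENDIF}{$ENDIF}
  SysUtils, Classes, SyncObjs;

const
  SpinLimit = 1000;     // hypothetical spin budget before blocking
  ThreadCount = 4;
  Iterations = 100000;

type
  TWorker = class(TThread)
  protected
    procedure Execute; override;
  end;

var
  PoolLock: TCriticalSection;
  Counter: Integer = 0;

// Try the lock a bounded number of times without blocking. If it stays
// busy, fall back to Enter, which waits until the holder releases the
// lock instead of polling with Sleep.
procedure AcquireSpinThenWait;
var
  I: Integer;
begin
  for I := 1 to SpinLimit do
    if PoolLock.TryEnter then
      Exit;
  PoolLock.Enter;
end;

procedure TWorker.Execute;
var
  I: Integer;
begin
  for I := 1 to Iterations do
  begin
    AcquireSpinThenWait;
    try
      Inc(Counter);  // the protected "allocator state" in this sketch
    finally
      PoolLock.Release;
    end;
  end;
end;

procedure RunDemo;
var
  Workers: array[0..ThreadCount - 1] of TWorker;
  I: Integer;
begin
  PoolLock := TCriticalSection.Create;
  try
    for I := 0 to ThreadCount - 1 do
      Workers[I] := TWorker.Create(False);
    for I := 0 to ThreadCount - 1 do
    begin
      Workers[I].WaitFor;
      Workers[I].Free;
    end;
    Writeln('Counter = ', Counter);
  finally
    PoolLock.Free;
  end;
end;

begin
  RunDemo;
end.
```

    With these constants the program should print Counter = 400000. The point is only the shape of the acquire path: a short spin covers brief contention, and a blocked thread is woken when the lock is released rather than after a fixed sleep interval.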

     
  • andre mussche
    2010-09-15

    Any locking is bad, whether it sleeps or spins.

    I use TopMM for multicore apps; it scales much better:
    http://www.topsoftwaresite.nl/