Loki STL allocator performance

  • loopless

    loopless - 2010-08-27

    I am using std::map with "simple" objects, such as std::map<int,int>.

    I experimented with the Loki STL allocator, as in:

    std::map<int, int, std::less<int>, ::Loki::LokiAllocator< std::pair<const int, int> > >

    On 32-bit Windows XP with Visual C++ 2008, in Release mode with _SECURE_SCL=0 (which is essential for decent performance from the Visual C++ STL), I don't see any performance benefit when inserting a large number of entries. In fact, using the Loki allocator is up to 30% slower.

    Note that benchmarks, even in Release mode, must NOT be run under the Visual C++ debugger, as it affects the timing quite a lot.

    Am I missing something here?
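
    A minimal sketch of the kind of insert benchmark described above. The helper name time_inserts and the typedef DefaultMap are hypothetical, and std::chrono is used for timing although it was not available in Visual C++ 2008. Only the default allocator is exercised here; the Loki variant is left as a comment because it needs the Loki headers.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <map>
#include <utility>

// Hypothetical helper: inserts `count` distinct int keys into a MapType and
// returns the elapsed time in seconds. Only the inserts are timed; the map is
// destroyed after the clock has stopped, so deallocation cost is not measured.
template <typename MapType>
double time_inserts(std::size_t count, std::size_t* final_size = nullptr)
{
    MapType m;
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < count; ++i) {
        m.insert(std::make_pair(static_cast<int>(i), static_cast<int>(i)));
    }
    const auto stop = std::chrono::steady_clock::now();

    assert(m.size() == count);  // every key was distinct
    if (final_size) {
        *final_size = m.size();  // report a result so the work is observable
    }
    return std::chrono::duration<double>(stop - start).count();
}

// Baseline map using the default std::allocator.
typedef std::map<int, int> DefaultMap;

// Assumption: with the Loki allocator header included, the map type from the
// post could be timed the same way:
// typedef std::map<int, int, std::less<int>,
//                  ::Loki::LokiAllocator< std::pair<const int, int> > > LokiMap;
```

    Calling time_inserts<DefaultMap>(1000000) and then the same for the Loki typedef, outside the debugger, would give the two numbers to compare.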


  • Richard Sposato

    Richard Sposato - 2010-09-08

    Hi Andrew,

    I'm pleased to let you know that there is now a bug report for SmallObj performance.  I added code to Loki which checks two pointers into its Chunk list before looking through the Chunk list.  This simple check reduces many delete calls from having linear runtime performance to constant runtime performance.  Occasionally, the allocator must still look through its Chunk list to find the address being deleted, but it should do that far less often now.

    You can get this fix by doing a checkout of the latest version of code from subversion.  The bug fix will also go into the next release of Loki.
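
    The idea of checking two cached pointers before walking the Chunk list can be sketched roughly as follows. This is not the Loki source: Chunk, ChunkFinder and every member name below are simplified, hypothetical stand-ins used only to illustrate the fast path.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for a chunk: a contiguous range of bytes.
struct Chunk {
    const unsigned char* begin;
    std::size_t size;

    bool contains(const void* p) const {
        const unsigned char* c = static_cast<const unsigned char*>(p);
        std::less<const unsigned char*> before;  // total order over pointers
        return !before(c, begin) && before(c, begin + size);
    }
};

// Finds the chunk that owns an address. Two cached pointers are tried first;
// the linear walk over all chunks only happens when both of them miss.
class ChunkFinder {
public:
    explicit ChunkFinder(const std::vector<Chunk>& chunks)
        : chunks_(chunks), lastAlloc_(nullptr), lastDealloc_(nullptr),
          linearSearches_(0) {}

    // Record which chunk served the most recent allocation.
    void note_allocation(std::size_t index) {
        assert(index < chunks_.size());
        lastAlloc_ = &chunks_[index];
    }

    const Chunk* find(const void* p) {
        if (lastDealloc_ && lastDealloc_->contains(p)) {
            return lastDealloc_;              // constant-time hit
        }
        if (lastAlloc_ && lastAlloc_->contains(p)) {
            lastDealloc_ = lastAlloc_;        // constant-time hit
            return lastDealloc_;
        }
        ++linearSearches_;                    // slow path: walk the list
        for (std::size_t i = 0; i < chunks_.size(); ++i) {
            if (chunks_[i].contains(p)) {
                lastDealloc_ = &chunks_[i];
                return lastDealloc_;
            }
        }
        return nullptr;                       // address not owned by any chunk
    }

    std::size_t linear_searches() const { return linearSearches_; }

private:
    std::vector<Chunk> chunks_;   // never resized, so cached pointers stay valid
    const Chunk* lastAlloc_;
    const Chunk* lastDealloc_;
    std::size_t linearSearches_;
};
```

    When consecutive deletes fall in the same chunk, or in the chunk that was just allocated from, find() returns without touching the list, which is the case the fix is meant to speed up.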



  • loopless

    loopless - 2010-09-08

    Hi Rich,
    Thanks… I will give some feedback on the change when I get a chance.


  • loopless

    loopless - 2010-11-16

    Even with the latest Loki, the Loki allocator is outperformed by the system malloc and free on Windows XP x64. For example, I am testing with objects of size 256 bytes using a single class, and calling its static operator new and operator delete functions:

    typedef Loki::SmallObject< ::Loki::ClassLevelLockable, 256*1024, 256, 16> MyAllocatorSingleton;
    void* tmp= MyAllocatorSingleton::operator new(size);

    LOKI          2.07813    2.03369
    malloc/free   1.57813    1.59647
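
    The post does not show the timing loop itself, so the following is only an assumed shape for such a test: allocate a batch of fixed-size blocks, release them all, and time the whole run. The names time_alloc_free, malloc_block and free_block are hypothetical; only the malloc/free side is shown, since the Loki side needs the Loki headers.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical harness: allocates `blocks` blocks of `size` bytes with `alloc`,
// then releases them with `release`, and returns the elapsed time in seconds.
template <typename AllocFn, typename FreeFn>
double time_alloc_free(std::size_t blocks, std::size_t size,
                       AllocFn alloc, FreeFn release)
{
    std::vector<void*> held(blocks, nullptr);  // set up before the clock starts

    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < blocks; ++i) {
        held[i] = alloc(size);
        assert(held[i] != nullptr);
    }
    for (std::size_t i = 0; i < blocks; ++i) {
        release(held[i], size);
    }
    const auto stop = std::chrono::steady_clock::now();

    return std::chrono::duration<double>(stop - start).count();
}

// System allocator wrapped to match the (pointer, size) release shape.
inline void* malloc_block(std::size_t size) { return std::malloc(size); }
inline void free_block(void* p, std::size_t /*size*/) { std::free(p); }
```

    A pair of wrappers that forward to MyAllocatorSingleton's operator new and operator delete could be passed in the same way to time the Loki side with an identical loop.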

  • loopless

    loopless - 2010-11-19

    If anyone is interested, I did discover a very high-performance, thread-safe 'small object' allocator that offers a significant improvement over the system malloc and free. It's part of Intel's TBB (Threading Building Blocks) and can be used in isolation from other parts of the library.

  • Richard Sposato

    Richard Sposato - 2010-12-07

    Thanks for the info.  I might have time to look at TBB later.



  • Jansen du Plessis

    It's because your test is a corner case for the Loki small object allocator. The Loki allocator improves performance when allocated objects are recycled, and your test only allocates.
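
    To illustrate the recycling argument, here is a minimal fixed-size pool sketch. It is not Loki's implementation; FixedPool and its members are hypothetical, and alignment and thread safety are ignored for brevity. The point is only that a freed block is handed straight back on the next request.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fixed-size block pool backed by one contiguous buffer.
// Freed blocks go onto a LIFO free list and are reused before fresh ones.
class FixedPool {
public:
    FixedPool(std::size_t blockSize, std::size_t blockCount)
        : storage_(blockSize * blockCount), freeList_(), fresh_(0),
          blockSize_(blockSize), blockCount_(blockCount)
    {
        assert(blockSize > 0);
        freeList_.reserve(blockCount);
    }

    void* allocate() {
        if (!freeList_.empty()) {       // recycle: no new memory is touched
            void* p = freeList_.back();
            freeList_.pop_back();
            return p;
        }
        if (fresh_ == blockCount_) {    // pool exhausted
            return nullptr;
        }
        return &storage_[blockSize_ * fresh_++];  // carve out a fresh block
    }

    // Assumes p was returned by allocate() on this pool.
    void deallocate(void* p) { freeList_.push_back(p); }

    // Number of blocks that were never recycled, only carved from the buffer.
    std::size_t fresh_blocks_used() const { return fresh_; }

private:
    std::vector<unsigned char> storage_;
    std::vector<void*> freeList_;
    std::size_t fresh_;
    std::size_t blockSize_;
    std::size_t blockCount_;
};
```

    In an allocate-only test the free list stays empty and every request takes the fresh-block path, so a benchmark that interleaves allocation and deallocation would be needed to exercise the recycling path.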

