Loki STL allocator performance

loopless
2010-08-27
2013-04-08
  • loopless
    2010-08-27

    I am using std::map with "simple" objects, such as std::map<int,int>.

    I experimented with the Loki STL allocator, declaring the map as:

    std::map<int, int, std::less<int>, ::Loki::LokiAllocator< std::pair<const int, int> > >

    On 32-bit Windows XP with Visual C++ 2008, in Release mode with _SECURE_SCL=0 (which is essential for decent performance from the Visual C++ STL), I don't see any performance benefit when inserting a large number of entries. In fact, the Loki allocator is up to 30% slower.

    Note, it is important that benchmarks, even in Release mode, NOT be run under the Visual C++ debugger as it affects the timing quite a lot.
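For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of an insertion benchmark. It is not the code from the original test: the function name time_inserts and the DefaultMap typedef are made up for illustration, and it assumes a C++11 compiler for std::chrono (which Visual C++ 2008 does not provide). The Loki variant is left as a comment so the sketch builds without Loki installed.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <map>
#include <utility>

// Times n insertions into a map of type MapType and returns elapsed seconds.
// MapType is a template parameter so the same loop can be reused with a
// different allocator argument. The map is destroyed after the clock stops,
// so deallocation cost is not included in the result.
template <typename MapType>
double time_inserts(int n, std::size_t* final_size = nullptr) {
    assert(n >= 0);
    typedef std::chrono::steady_clock Clock;
    MapType m;
    const Clock::time_point start = Clock::now();
    for (int i = 0; i < n; ++i) {
        m.insert(std::make_pair(i, i));
    }
    const Clock::time_point stop = Clock::now();
    if (final_size) {
        *final_size = m.size();  // lets the caller check the work was done
    }
    return std::chrono::duration<double>(stop - start).count();
}

// Baseline: std::map with the default std::allocator.
typedef std::map<int, int> DefaultMap;

// With Loki available, the variant from the post would be declared as:
// typedef std::map<int, int, std::less<int>,
//     ::Loki::LokiAllocator< std::pair<const int, int> > > LokiMap;
```

Calling time_inserts<DefaultMap>(n) and time_inserts<LokiMap>(n) several times each, outside the debugger, gives figures that can be compared directly.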

    Am I missing something here?

    Andrew

     
  • Hi Andrew,

    I'm pleased to let you know that there is now a bug report for SmallObj performance.  I added code to Loki which checks two pointers into its Chunk list before looking through the Chunk list.  This simple check reduces many delete calls from having linear runtime performance to constant runtime performance.  Occasionally, the allocator must still look through its Chunk list to find the address being deleted, but it should do that far less often now.
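The idea of checking two cached pointers before walking the Chunk list can be sketched roughly as follows. This is not Loki's actual code: Chunk, ChunkFinder, note_alloc, and the scan counter are hypothetical names invented for the illustration, and the sketch only models the lookup, not the allocation itself.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical, simplified chunk: one contiguous address range.
struct Chunk {
    const unsigned char* begin;
    std::size_t size;

    bool contains(const void* p) const {
        const unsigned char* q = static_cast<const unsigned char*>(p);
        // std::less gives a well-defined order even for unrelated pointers.
        std::less<const unsigned char*> lt;
        return !lt(q, begin) && lt(q, begin + size);
    }
};

// Finds the chunk that owns an address. Two cached pointers (the chunk used
// by the last allocation and by the last deallocation) are tried first; only
// when both miss does the lookup fall back to a linear scan.
class ChunkFinder {
public:
    explicit ChunkFinder(std::vector<Chunk> chunks)
        : chunks_(std::move(chunks)), last_alloc_(nullptr),
          last_dealloc_(nullptr), scans_(0) {}

    // Record which chunk served the most recent allocation.
    void note_alloc(const Chunk* c) { last_alloc_ = c; }

    const Chunk* find(const void* p) {
        if (last_dealloc_ && last_dealloc_->contains(p)) return last_dealloc_;
        if (last_alloc_ && last_alloc_->contains(p)) {
            last_dealloc_ = last_alloc_;
            return last_alloc_;
        }
        ++scans_;  // slow path: walk the whole chunk list
        for (std::size_t i = 0; i < chunks_.size(); ++i) {
            if (chunks_[i].contains(p)) {
                last_dealloc_ = &chunks_[i];
                return last_dealloc_;
            }
        }
        return nullptr;  // address does not belong to any chunk
    }

    const Chunk* chunk(std::size_t i) const { return &chunks_[i]; }
    std::size_t scans() const { return scans_; }

private:
    std::vector<Chunk> chunks_;  // never resized, so cached pointers stay valid
    const Chunk* last_alloc_;
    const Chunk* last_dealloc_;
    std::size_t scans_;
};
```

When consecutive deletes hit the same chunk, or a block is deleted soon after it was allocated, the cached pointers answer the lookup and the scan counter does not move.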

    You can get this fix by doing a checkout of the latest version of code from subversion.  The bug fix will also go into the next release of Loki.

    Cheers,

    Rich

     
  • loopless
    2010-09-08

    Hi Rich,
    Thanks… I will give some feedback on the change when I get a chance.

    Andrew

     
  • loopless
    2010-11-16

    Even with the latest Loki, the Loki::Allocator is outperformed by the system malloc and free on Windows XP x64. For example, I am testing with objects of size 256 bytes, using a single class and calling its static operator new and operator delete functions:

    typedef Loki::SmallObject< ::Loki::ClassLevelLockable, 256*1024, 256, 16> MyAllocatorSingleton;

    For example:

    void* tmp = MyAllocatorSingleton::operator new(size);

    Loki           2.07813    2.03369
    malloc/free    1.57813    1.59647
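A timing loop for this kind of fixed-size comparison might look like the sketch below. It is an assumption about the shape of the test, not the code that produced the figures above: time_alloc_free, sys_alloc, and sys_free are illustrative names, and std::chrono requires C++11. Only the malloc/free side is shown; the Loki calls from the post would need equivalent wrapper functions.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <vector>

typedef void* (*AllocFn)(std::size_t);
typedef void (*FreeFn)(void*);

// Thin wrappers so the system allocator matches the function-pointer types.
void* sys_alloc(std::size_t n) { return std::malloc(n); }
void sys_free(void* p) { std::free(p); }

// Allocates `count` blocks of `block_size` bytes, then frees them all,
// and returns the elapsed time in seconds.
double time_alloc_free(AllocFn alloc, FreeFn release,
                       std::size_t count, std::size_t block_size) {
    assert(alloc != nullptr && release != nullptr);
    typedef std::chrono::steady_clock Clock;
    std::vector<void*> blocks(count, nullptr);  // sized before timing starts
    const Clock::time_point start = Clock::now();
    for (std::size_t i = 0; i < count; ++i) {
        blocks[i] = alloc(block_size);
    }
    for (std::size_t i = 0; i < count; ++i) {
        release(blocks[i]);
    }
    const Clock::time_point stop = Clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

A call such as time_alloc_free(sys_alloc, sys_free, 1000000, 256) gives the baseline; passing a different pair of functions times another allocator with the same loop.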

     
  • loopless
    2010-11-19

    If anyone is interested, I did discover a very high-performance, thread-safe "small object" allocator that offers a significant improvement over the system malloc and free. It's part of Intel's TBB (Threading Building Blocks) and can be used in isolation from the other parts of the library.

     
  • Thanks for the info.  I might have time to look at TBB later.

    Cheers,

    Rich