I am using std::map with "simple" objects, such as std::map<int,int>.
I experimented with the Loki STL allocator, as in:
std::map<int,int,std::less<int> ,::Loki::LokiAllocator< std::pair<const int,int> > >
On 32-bit Windows XP with Visual C++ 2008, in Release mode with _SECURE_SCL=0 (which is essential for decent performance from the Visual C++ STL), I don't see any performance benefit when inserting a large number of entries. In fact, using the Loki allocator is up to 30% slower.
Note that benchmarks, even in Release mode, must NOT be run under the Visual C++ debugger, as it affects the timing quite a lot.
Am I missing something here?
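For anyone who wants to reproduce this kind of measurement, here is a rough sketch of an insert-only timing harness. The helper name time_map_inserts is made up for illustration, it uses C++11 <chrono> rather than whatever timer the original test used, and the default argument is the standard allocator, so it compiles without Loki.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <utility>

// Hypothetical helper (not part of Loki or the STL): times the insertion of
// n sequential keys into a std::map that allocates its nodes through Alloc.
// Returns elapsed seconds; optionally reports the final map size.
template <class Alloc = std::allocator<std::pair<const int, int> > >
double time_map_inserts(int n, std::size_t* final_size = nullptr) {
    assert(n >= 0);
    typedef std::map<int, int, std::less<int>, Alloc> Map;

    Map m;
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < n; ++i) {
        m.insert(std::make_pair(i, i));  // one node allocation per insert
    }
    const auto stop = std::chrono::steady_clock::now();

    if (final_size) {
        *final_size = m.size();
    }
    // The map is destroyed after 'stop', so node deallocation is not timed.
    return std::chrono::duration<double>(stop - start).count();
}
```

To compare allocators, the same function could be instantiated with ::Loki::LokiAllocator< std::pair<const int,int> > as the template argument, assuming the Loki headers are on the include path. Note that this workload only allocates; nothing is freed while the clock is running.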
I'm pleased to let you know that there is now a bug report for SmallObj performance. I added code to Loki which checks two pointers into its Chunk list before looking through the Chunk list. This simple check reduces many delete calls from having linear runtime performance to constant runtime performance. Occasionally, the allocator must still look through its Chunk list to find the address being deleted, but it should do that far less often now.
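To illustrate the idea of checking two cached chunks before walking the list, here is a toy model. It is not Loki's code: ToyChunk, ToyChunkIndex, and all member names are invented for this sketch, and it only tracks address ranges rather than managing real memory.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Toy model only; all names here are hypothetical.
struct ToyChunk {
    const unsigned char* begin;
    std::size_t bytes;

    bool contains(const void* p) const {
        const unsigned char* c = static_cast<const unsigned char*>(p);
        std::less<const unsigned char*> before;  // total order over pointers
        return !before(c, begin) && before(c, begin + bytes);
    }
};

class ToyChunkIndex {
public:
    ToyChunkIndex() : lastAlloc_(0), lastDealloc_(0), scans_(0) {}

    void addChunk(const unsigned char* begin, std::size_t bytes) {
        ToyChunk c = { begin, bytes };
        chunks_.push_back(c);
    }

    // Record which chunk served the most recent allocation.
    void noteAllocation(std::size_t index) {
        assert(index < chunks_.size());
        lastAlloc_ = index;
    }

    // Returns the index of the chunk owning p, or chunkCount() if none does.
    std::size_t findOwner(const void* p) {
        // Constant-time path: try the two cached chunks first.
        if (lastDealloc_ < chunks_.size() && chunks_[lastDealloc_].contains(p)) {
            return lastDealloc_;
        }
        if (lastAlloc_ < chunks_.size() && chunks_[lastAlloc_].contains(p)) {
            lastDealloc_ = lastAlloc_;
            return lastAlloc_;
        }
        // Linear-time fallback: walk the whole chunk list.
        ++scans_;
        for (std::size_t i = 0; i < chunks_.size(); ++i) {
            if (chunks_[i].contains(p)) {
                lastDealloc_ = i;  // remember for the next lookup
                return i;
            }
        }
        return chunks_.size();
    }

    std::size_t chunkCount() const { return chunks_.size(); }
    std::size_t scans() const { return scans_; }  // number of full walks so far

private:
    std::vector<ToyChunk> chunks_;
    std::size_t lastAlloc_;
    std::size_t lastDealloc_;
    std::size_t scans_;
};
```

When consecutive deletes land in the same chunk, or in the chunk that was just allocated from, findOwner never reaches the loop. The scans() counter shows how often the fallback was needed.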
You can get this fix by doing a checkout of the latest version of code from subversion. The bug fix will also go into the next release of Loki.
Thanks… I will give some feedback on the change when I get a chance.
Even with the latest Loki, the Loki allocator is outperformed by the system malloc and free on Windows XP x64. For example, I am testing with objects of size 256 bytes using a single class, and calling its static operator new and operator delete functions:
typedef Loki::SmallObject< ::Loki::ClassLevelLockable, 256*1024, 256, 16> MyAllocatorSingleton;
void* tmp= MyAllocatorSingleton::operator new(size);
LOKI 2.07813 2.03369
malloc/free 1.57813 1.59647
If anyone is interested, I did discover a very high-performance, thread-safe 'small object' allocator that offers a significant improvement over the system malloc and free. It's part of Intel's TBB (Threading Building Blocks) and can be used in isolation from other parts of the library.
Thanks for the info. I might have time to look at TBB later.
It's because your test is a corner case for the Loki small object allocator. The Loki allocator improves performance if you recycle the allocated objects, and in your test you only allocated.
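The difference between the two workloads can be sketched like this. The function names fill_only and fill_with_recycling and the window parameter are made up for illustration; the sketch only shows the access patterns and makes no claim about how much faster either one runs with a given allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>

// Allocate-only workload: every insert creates a new node and nothing is
// released until the map itself is destroyed.
template <class Map>
void fill_only(Map& m, int n) {
    for (int i = 0; i < n; ++i) {
        m.insert(std::make_pair(i, i));
    }
}

// Recycling workload: at most 'window' nodes are live at once. Once the
// window is full, each insert is preceded by an erase, so the allocator
// gets a freed node back before it is asked for a new one.
template <class Map>
void fill_with_recycling(Map& m, int n, int window) {
    assert(window > 0);
    for (int i = 0; i < n; ++i) {
        if (i >= window) {
            m.erase(i - window);  // free the oldest node first
        }
        m.insert(std::make_pair(i, i));
    }
}
```

Timing both functions with the same map type, once with the default allocator and once with the Loki one, would show whether recycling changes the picture on a particular platform.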