SmallObj for allocator

Help
2007-01-05
2013-04-08
  • I have write a allocator like std::allocator, in the allocator, I used SmallObj, but in test, it's slowly than std::allocator, why it is?
    I use MingW, gcc 3.4.2, my test result is:
    deque: 63
    deque with op alloc: 94

    #----------------------------------------
    #Makefile:
    #----------------------------------------
    CPP  = g++.exe
    OBJ  = rtp_test.o
    LINKOBJ  = rtp_test.o
    LIBS =  -L"D:/MinGW/lib" -L"D:/Lib/loki-0.1.5/lib" -lloki -s
    CXXINCS =  -I"D:/MinGW/include"  -I"D:/MinGW/include/c++/3.4.2" -I"D:/Lib/loki-0.1.5/include"
    BIN  = rtp_test.exe
    CXXFLAGS = $(CXXINCS)   -O2
    RM = rm -f

    .PHONY: all all-before all-after clean clean-custom

    all: all-before rtp_test.exe all-after

    clean: clean-custom
        ${RM} $(OBJ) $(BIN)

    $(BIN): $(OBJ)
        $(CPP) $(LINKOBJ) -o "rtp_test.exe" $(LIBS)

    rtp_test.o: rtp_test.cpp
        $(CPP) -c rtp_test.cpp -o rtp_test.o $(CXXFLAGS)

    //----------------------------------------------------------------------------
    //op_alloc.h:
    //----------------------------------------------------------------------------
    #ifndef OP_ALLOC_H
    #define OP_ALLOC_H

    #include <loki/SmallObj.h>

    #define LOKI_ALLOCATOR_PARAMETERS Loki::SingleThreaded, 4096, 128, 4, Loki::NoDestroy

    namespace MyTest
    {
        template<typename _Tp>
            class op_alloc : public Loki::AllocatorSingleton< LOKI_ALLOCATOR_PARAMETERS >
        {
            public:
                typedef size_t     size_type;
                typedef ptrdiff_t  difference_type;
                typedef _Tp*       pointer;
                typedef const _Tp* const_pointer;
                typedef _Tp&       reference;
                typedef const _Tp& const_reference;
                typedef _Tp        value_type;

                template<typename _Tp1>
                    struct rebind
                    { typedef op_alloc<_Tp1> other; };

                op_alloc() throw() { }

                op_alloc(const op_alloc&) throw() { }

                template<typename _Tp1>
                    op_alloc(const op_alloc<_Tp1>&) throw() { }

                ~op_alloc() throw() { }

                pointer address(reference __x) const { return &__x; }

                const_pointer address(const_reference __x) const { return &__x; }

                // NB: __n is permitted to be 0.  The C++ standard says nothing
                // about what the return value is when __n == 0.
                pointer allocate(size_type __n, const void* = 0)
                    //{ return static_cast<_Tp*>(::operator new(__n * sizeof(_Tp))); }
                {
                    return static_cast<_Tp*>(Allocate(__n * sizeof(_Tp), true));
                }

                // __p is not permitted to be a null pointer.
                void deallocate(pointer __p, size_type)
                    //{ ::operator delete(__p); }
                {
                    Deallocate(__p);
                }

                size_type max_size() const throw()
                { return size_t(-1) / sizeof(_Tp); }

                void construct(pointer __p, const _Tp& __val)
                { ::new(__p) _Tp(__val); }

                void destroy(pointer __p) { __p->~_Tp(); }
        };

        template<typename _Tp>
            inline bool
            operator==(const op_alloc<_Tp>&, const op_alloc<_Tp>&)
            { return true; }

        template<typename _Tp>
            inline bool
            operator!=(const op_alloc<_Tp>&, const op_alloc<_Tp>&)
            { return false; }
    } // namespace MyTest

    #endif

    //----------------------------------------------------------------------------
    //rtp_test.cpp:
    //----------------------------------------------------------------------------
    #include <windows.h>
    #include <stdio.h>

    #include <deque>
    #include "op_alloc.h"

    #define LOOP_COUNT (20*1024*1024/sizeof(rtp_pkg))

    struct rtp_pkg
    {
        unsigned char buffer[16];
    };

    typedef void ( *TESTCASE )();

    void test_suite( TESTCASE testcase )
    {
        DWORD start_time = GetTickCount();
        ( *testcase )();
        printf( "%d\n", GetTickCount() - start_time );
    }

    void test_deque()
    {
        std::deque< rtp_pkg > d;
        rtp_pkg *pkg = new rtp_pkg;

        for ( int i = 0; i < LOOP_COUNT; ++i )
        {
            d.push_back( *pkg );
        }
    }

    void test_deque_op_alloc()
    {
        std::deque< rtp_pkg, MyTest::op_alloc<rtp_pkg> > d;
        rtp_pkg *pkg = new rtp_pkg;

        for ( int i = 0; i < LOOP_COUNT; ++i )
        {
            d.push_back( *pkg );
        }
    }

    int main( int argc, char *argv[] )
    {
        printf( "deque: " );
        test_suite( &test_deque );
        printf( "deque with op alloc: " );
        test_suite( &test_deque_op_alloc );

        return 0;
    }

     
    • Hi,

      Guess it's a thing of the past, but I tried your code, and due to my use of it, it was a while before  crashes occurred, so just for the record:
          Don't inherit from Loki::AllocatorSingleton.

      This leads to explicit calls to the destructor, which is wrong. If this happens while memory is allocated, there's a crash. It's mentioned in the book, and noted in docs, but the unwary (me) may miss it.
      Btw, the SmallObj based allocator is doing quite well in my allocator shootout (contenders: boost [fast] pool alloc, STLPort and Dinkumware, under MSVC and gcc)

      Here's my version with the fix. I'm still largely unwary user so there may be other problems.

          - rasmus POINT ekman IN abc se

      ############ file loki_allocator.hpp runs to end of post ############

      #ifndef LOKI_ALLOCATOR_HPP_INCLUDED
      #define LOKI_ALLOCATOR_HPP_INCLUDED

      // Requires project to be compiled with loki/src/SmallObj.cpp and Singleton.cpp

      #include <loki/SmallObj.h>

      //-------------------------------------------------------------------

      // Adapted from anonymous post to Loki forums, fixing erroneous inheritance
      template<typename _Tp, typename AllocT = Loki::AllocatorSingleton<> >
      class loki_allocator
      {
      public:
          typedef size_t      size_type;
          typedef ptrdiff_t   difference_type;
          typedef _Tp*        pointer;
          typedef const _Tp*  const_pointer;
          typedef _Tp&        reference;
          typedef const _Tp&  const_reference;
          typedef _Tp         value_type;

          loki_allocator() throw() {}

          loki_allocator(const loki_allocator&) throw() {}

          template<typename _Tp1>
          loki_allocator(const loki_allocator<_Tp1>&) throw() {}

          ~loki_allocator() throw() { }

          template<typename _Tp1>
          struct rebind  { typedef loki_allocator<_Tp1> other; };

          pointer address(reference x) const { return &x; }

          const_pointer address(const_reference __x) const { return &__x; }

          // NB: __n is permitted to be 0. The C++ standard says nothing
          // about what the return value is when __n == 0.
          pointer allocate(size_type __n, const void* = 0)
          //{ return static_cast<_Tp*>(::operator new(__n * sizeof(_Tp))); }
          {
              return static_cast<_Tp*>( AllocT::Instance().Allocate(__n * sizeof(_Tp), true) );
          }

          // __p is not permitted to be a null pointer.
          void deallocate(pointer __p, size_type)
          //{ ::operator delete(__p); }
          {
              return AllocT::Instance().Deallocate(__p);
          }

          size_type max_size() const throw() { return size_t(-1) / sizeof(_Tp); }

          void construct(pointer __p, const _Tp& __val) { ::new(__p) _Tp(__val); }

          void destroy(pointer __p) { __p->~_Tp(); }
      };

      //-------------------------------------------------------------------

      template<typename _Tp>
      inline bool operator==(const loki_allocator<_Tp>&, const loki_allocator<_Tp>&) { return true; }

      template<typename _Tp>
      inline bool operator!=(const loki_allocator<_Tp>&, const loki_allocator<_Tp>&) { return false; }

      //-------------------------------------------------------------------

      #endif  // LOKI_ALLOCATOR_INCLUDED

       
    • Hi Rasmus (?),

      Glad you like Loki's small object memory allocator.  Especially glad to hear it does well compared to several other memory allocators.

      I'd like to recommend a minor change to your code which will make Loki's allocator run even faster.
      Just replace this line:
      return AllocT::Instance().Deallocate(__p);
      with this one:
      return AllocT::Instance().Deallocate( __p, sizeof( _Tp ) );

      If the allocator knows the size, and can pass that into the Loki::AllocatorSingleton::Deallocate, then the allocator knows exactly which Loki::FixedAllocator in the pool contains the address getting deallocated.  Otherwise it must look for the address among all Loki::FixedAllocator's.

      You're right that people should not inherit directly from Loki::AllocatorSingleton, and inherit from either SmallObject or SmallValueObject.

      Would you mind if I borrowed your code, adjusted it, added tests, and placed it in the Loki library?  If it works well, I'd like to provide it as a drop-in replacement for the standard allocator used with the STL containers.  Please let me know your thoughts on this matter.

      Sincerely,

      Rich

       
    • The allocator described earlier in this post is now part of Loki!  You can find it inside ./include/loki/Allocator.h.  Most of the code went in exactly as shown above, but I made some minor changes for readability, and added documentation comments.

      To use LokiAllocator, just declare your STL container as something akin to this:
      typedef ::std::vector< TestResult, ::Loki::LokiAllocator< TestResult > > Results;

      Cheers & Thank you!

      Rich

       
    • Steve Townsend
      Steve Townsend
      2008-09-11

      Is there an elegant way to make this STL-compliant allocator (for which many thanks) thread-safe on Windows?

       
    • Steve Townsend
      Steve Townsend
      2008-09-12

      btw this wrapper would be more efficient and (I believe) no less correct if the allocator singleton instance is cached by the constructor to avoid repeated calls to the Instance method.

       
  • I've made a small change to Loki's SmallObject allocator which makes a big difference in performance.  The change is in the FixedAllocator::Deallocate function.  Instead of searching through the Chunk list almost every time, it checks if either of its two pointers into the Chunk list have the address to be deallocated. This changes the performance from a linear search into a constant time operation most of the time.  Occasionally, it still has to search the list, but not as often so the overall performance is better.

    Rich