map hangs during insert

2008-03-11
2013-04-25
  • Nobody/Anonymous

    Hi,

    I'm doing some testing with stxxl-1.1.0 on Linux.
    Distro: RHEL AS4
    Kernel: 2.6.9-42.ELsmp
    gcc: 3.4.6

    I built STXXL according to the instructions on
    http://algo2.iti.uni-karlsruhe.de/dementiev/stxxl/doxy/html/installation_linux_g++.html

    Boost is not being used on our system.

    The STXXL cache file size is 200M. We're not using raw disk. I used the createdisks.stxxl.bin utility to create the cache file as an ordinary file on the machine's local filesystem.

    I created an stxxl::map as follows:

    struct IntCmp
    {
      static int min_value() { return INT_MIN; }
      static int max_value() { return INT_MAX; }
      bool operator()( const int& lhs, const int& rhs ) const { return lhs < rhs; }
    };

    stxxl::map< int, int, IntCmp > testMap( 10 * 1024 * 1024, 10 * 1024 * 1024 );

    The key and value type is integer, and the node and leaf cache is 10 M each.
    I ran some tests to see how much data can be stored in the map. Inserting upto 1 million elements works fine. Then the inserts become very slow. After adding another 100k elements to the map, the program just hangs.

    If I increase the node and leaf cache size to 20M, the insert works upto 2 million elements. Then it starts hanging again. Here's the stack trace when the program is stuck:

    Thread 2 (Thread -1219314768 (LWP 947)):
    #0  0x001aa7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
    #1  0x0037c41b in __write_nocancel () from /lib/tls/libpthread.so.0
    #2  0x080700f5 in stxxl::syscall_request::serve ()
    #3  0x08076708 in stxxl::disk_queue::worker ()
    #4  0x00377371 in start_thread () from /lib/tls/libpthread.so.0
    #5  0x0028fffe in clone () from /lib/tls/libc.so.6
    Thread 1 (Thread -1208169824 (LWP 892)):
    #0  0x001aa7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
    #1  0x00379b26 in pthread_cond_wait@@GLIBC_2.3.2 ()
    #2  0x0806def7 in stxxl::state::wait_for ()
    #3  0x0806cff8 in stxxl::ufs_request_base::wait ()
    #4  0x0804f4c4 in stxxl::btree::normal_leaf<int, int, IntCmp, 131072u, stxxl::btree::btree<int, int, IntCmp, 16384u, 131072u, stxxl::SR> >::save ()
    #5  0x0805145a in stxxl::btree::node_cache<stxxl::btree::normal_leaf<int, int, IntCmp, 131072u, stxxl::btree::btree<int, int, IntCmp, 16384u, 131072u, stxxl::SR> >, stxxl::btree::btree<int, int, IntCmp, 16384u, 131072u, stxxl::SR> >::get_node ()
    #6  0x08050b41 in stxxl::btree::btree<int, int, IntCmp, 16384u, 131072u, stxxl::SR>::insert ()
    #7  0x0804c2c4 in stxxl::btree::btree<int, int, IntCmp, 16384u, 131072u, stxxl::SR>::operator[] ()
    #8  0x0804b1f4 in stxxl::map<int, int, IntCmp, 16384u, 131072u, stxxl::SR>::operator[] ()
    #9  0x0804ab26 in main ()

    Has anyone else seen this problem. It doesn't happen with other data structures such as stxxl::vector. I created a vector with 40m integers and it worked fine. It is only happening with stxxl::map.

    -Rahul Sood
    rsood22@yahoo.com

     
    • Nobody/Anonymous

      Hi Rahul,

      You program looks like hanging because the map data structure has to do a lot of I/O (the cache is too small). See also my previous post, you should increase the cache sizes in order to speedup your program.

      The stxxl::vector writes the elements sequentially spending only Theta(1/B) I/Os for an insertion. For a B-tree (the basis of stxxl::map) the I/O complexity of an insertion is Theta(round_up(log_B (N))) in the worst case which is much more. In the case of insertion of random keys only caching can help to reduce the I/O load.

      Roman

       

Log in to post a comment.