Efficiently appending to a vector

  • Bruce Merry

    Bruce Merry - 2012-11-21


    In my application I build a vector linearly, and later access it (semi-) randomly. I'm trying to find a way to build a vector that is efficient and overlaps I/O and computation. At the moment I get the data in lumps averaging around 256K (i.e. less than a block, but big enough not to worry about function call overheads), so I'm appending something like this:

    // n is the number of incoming elements: grow the vector, then copy into the new tail.
    vector_type::size_type oldSize = v.size();
    v.resize(oldSize + n);
    std::copy(incoming.begin(), incoming.end(), v.begin() + oldSize);

    However, that doesn't seem to do that great a job: if I use a memory-backed disk, this code shifts about 1GB/s on a Sandy Bridge Core i7. Looking at the code, it seems that when this appends a new page, it evicts the old one synchronously.

    Is there something like buf_ostream that will push data into a vector while overlapping I/O and computation? Is it worth learning how the block manager works and trying to do this directly on blocks, then sticking it all together with vector::set_content?

  • Bruce Merry

    Bruce Merry - 2012-11-22

    I didn't find anything that would write into a vector, but in the end I wrote a layer on top of stxxl::buffered_writer that allocates blocks directly, and used stxxl::vector::set_content at the end to construct the vector. That's a lot faster (around 4GB/s in the main thread), and seems to do a good job of overlapping I/O with computation.
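
The general idea behind that approach (fill one block while the previous one is still being written) can be illustrated with a minimal sketch in standard C++. This is not the code from the post and it does not use the STXXL API: `BlockStore` and `OverlappedAppender` are hypothetical names, the "disk" is an in-memory list of blocks, and a single `std::async` task stands in for asynchronous block I/O.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <utility>
#include <vector>

// Hypothetical stand-in for external storage: an in-memory list of blocks.
struct BlockStore {
    std::vector<std::vector<int>> blocks;
    void write(std::vector<int> block) { blocks.push_back(std::move(block)); }
};

// Double-buffered appender: the caller fills one block while the previous
// block is being written by a background task.
class OverlappedAppender {
public:
    OverlappedAppender(BlockStore& store, std::size_t blockSize)
        : store_(store), blockSize_(blockSize), total_(0) {
        assert(blockSize_ > 0);
        current_.reserve(blockSize_);
    }

    // Append n elements, handing off each block as soon as it is full.
    void append(const int* data, std::size_t n) {
        while (n > 0) {
            std::size_t room = blockSize_ - current_.size();
            std::size_t take = n < room ? n : room;
            current_.insert(current_.end(), data, data + take);
            data += take;
            n -= take;
            if (current_.size() == blockSize_) submit();
        }
    }

    // Write the final partial block (if any), wait for outstanding work,
    // and return the total number of elements appended.
    std::size_t finish() {
        if (!current_.empty()) submit();
        if (pending_.valid()) pending_.get();
        return total_;
    }

private:
    void submit() {
        // Allow at most one write in flight; this also keeps blocks in order.
        if (pending_.valid()) pending_.get();
        total_ += current_.size();
        std::vector<int> full = std::move(current_);
        current_.clear();  // reset the moved-from buffer to a known state
        current_.reserve(blockSize_);
        pending_ = std::async(std::launch::async, &BlockStore::write,
                              &store_, std::move(full));
    }

    BlockStore& store_;
    std::size_t blockSize_;
    std::size_t total_;
    std::vector<int> current_;
    std::future<void> pending_;
};
```

The caller only blocks when a second block fills up before the previous write has finished, which is the overlap the original resize-and-copy loop lacked. In the approach described in the post, the filled blocks would presumably be handed to stxxl::buffered_writer and the resulting block IDs passed to stxxl::vector::set_content once all data has arrived.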

