In my application I build a vector linearly and later access it (semi-)randomly. I'm looking for an efficient way to build the vector that overlaps I/O and computation. At the moment I get the data in lumps averaging around 256K (i.e. less than a block, but big enough not to worry about function call overheads), so I'm appending something like this:
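// Grow the vector by the size of the incoming lump, then copy the lump into the new tail.
vector_type::size_type oldSize = v.size();
v.resize(oldSize + incoming.size());
std::copy(incoming.begin(), incoming.end(), v.begin() + oldSize);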
However, that doesn't perform very well: with a memory-backed disk, this code moves about 1GB/s on a Sandy Bridge Core i7. Looking at the vector code, it seems that when an append touches a new page, the old one is evicted synchronously.
Is there something like buf_ostream that will push data into a vector while overlapping I/O and computation? Is it worth learning how the block manager works and trying to do this directly on blocks, then sticking it all together with vector::set_content?
I didn't find anything that would write into a vector, but in the end I wrote a layer on top of stxxl::buffered_writer that allocates blocks directly, and used stxxl::vector::set_content at the end to construct the vector. That's a lot faster (around 4GB/s in the main thread), and seems to do a good job of overlapping I/O.
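For anyone who wants the general shape of that layer, here is a minimal sketch of the idea using only the standard library. It is not my actual code and it does not call STXXL: Block, BlockAppender and the "write" inside std::async are hypothetical stand-ins for a real block type, the buffered writer, and the disk write. The assumption is simply that incoming lumps are packed into fixed-size blocks, each full block is handed to a background task while the caller keeps filling the next one, and the ordered list of blocks plus the element count is what you would pass to something like set_content at the end.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <future>
#include <utility>
#include <vector>

// Hypothetical stand-in for one external-memory block.
using Block = std::vector<int>;

// Hypothetical helper: packs incoming lumps into fixed-size blocks and hands
// each full block to a background task, so filling overlaps with "writing".
class BlockAppender {
public:
    explicit BlockAppender(std::size_t block_elems) : block_elems_(block_elems) {
        assert(block_elems_ > 0);
        current_.reserve(block_elems_);
    }

    // Append one incoming lump; a lump may span several blocks.
    void append(const std::vector<int>& incoming) {
        std::size_t pos = 0;
        while (pos < incoming.size()) {
            const std::size_t room = block_elems_ - current_.size();
            const std::size_t take = std::min(room, incoming.size() - pos);
            current_.insert(current_.end(),
                            incoming.data() + pos,
                            incoming.data() + pos + take);
            pos += take;
            total_ += take;
            if (current_.size() == block_elems_) submit();
        }
    }

    // Flush the partial last block, wait for the outstanding write, and
    // return the blocks in append order together with the element count.
    std::pair<std::vector<Block>, std::size_t> finish() {
        if (!current_.empty()) submit();
        wait_pending();
        return std::make_pair(std::move(written_), total_);
    }

private:
    // Hand the current block to a background task and start a fresh one.
    void submit() {
        wait_pending();  // keep at most one write in flight (double buffering)
        Block full = std::move(current_);
        current_ = Block();
        current_.reserve(block_elems_);
        // Assumption: a real implementation would write the block to disk here;
        // this sketch only passes the block through the background task.
        pending_ = std::async(std::launch::async,
                              [b = std::move(full)]() mutable { return std::move(b); });
    }

    // Collect the block from the in-flight write, if there is one.
    void wait_pending() {
        if (pending_.valid()) written_.push_back(pending_.get());
    }

    std::size_t block_elems_;
    Block current_;
    std::vector<Block> written_;
    std::future<Block> pending_;
    std::size_t total_ = 0;
};
```

Usage is: construct a BlockAppender with the block size in elements, call append() for each lump, then call finish() once. Keeping only one write in flight is the simplest form of double buffering; a real buffered writer would normally keep a small pool of buffers so several writes can be queued.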