On Thu, Jun 16, 2011 at 19:16, Roy Stogner <roystgnr@ices.utexas.edu> wrote:
The GNU STL implementation is a twisty maze of passages, all alike, so
I may have missed something, but the only obvious specializations I
see are the memset-based char one and the vector<bool> implementation.

Why don't you disassemble the function? It should either be inlined into a tight loop or turned into a call to a memzero-style function.
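A minimal sketch of a test case for that, assuming the code in question boils down to std::fill over doubles. The name zero_doubles is hypothetical and not from the thread; extern "C" is used only so the symbol is unmangled and easy to find in the output.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper, not from the original code: zero n doubles via std::fill.
// extern "C" keeps the symbol name unmangled so it is easy to locate
// in the assembly listing.
extern "C" void zero_doubles(double* p, std::size_t n)
{
  std::fill(p, p + n, 0.0);
}
```

Compiling this with something like "g++ -O2 -S" (or running "objdump -d" on the object file) and searching for zero_doubles should show whether the body is a store loop or a call to memset.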

Note that some compilers will replace a naive loop to zero or copy memory with a call to an internal memset or memcpy function. The Intel compiler has done this for a while. I find it slightly amusing that they made that "optimization" long before making the compiler generate decent code for STREAM (it was horrible until version 11).
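To make the comparison concrete, here is a sketch of the two forms a compiler may treat as equivalent. The function names are made up for illustration, and the memset version assumes IEEE 754 doubles, where all-bits-zero represents 0.0.

```cpp
#include <cstddef>
#include <cstring>

// Naive loop: some compilers recognize this pattern and emit a call
// to an internal memset instead of a store loop.
void zero_loop(double* p, std::size_t n)
{
  for (std::size_t i = 0; i < n; ++i)
    p[i] = 0.0;
}

// Explicit call: same result, assuming all-bits-zero is 0.0
// (true for IEEE 754 doubles).
void zero_memset(double* p, std::size_t n)
{
  std::memset(p, 0, n * sizeof(double));
}
```

Whether a given compiler actually performs the replacement depends on the version and optimization level, so the disassembly is the only reliable check.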

Also note that you can pretty reliably beat the compiler and library implementations if you know which level of the memory hierarchy the data will be in and/or how long it will be until you reuse it. But this sort of optimization is probably not what you want to spend time rolling yourself.