From: Kirk, B. (JSC-EG311) <ben...@na...> - 2011-06-16 16:36:24
Excellent. Any objections to including <cstring> and making it std::memset instead?

On 6/16/11 11:07 AM, "John Peterson" <jwp...@gm...> wrote:

> Hi,
>
> We were doing some profiling here recently and were surprised when
> DenseMatrix::zero() showed up as 11% in a particular application!
>
> (You'll have to ask Derek why he was calling DenseMatrix::zero() so
> many times ;-P)
>
> If you replace std::fill() in this method with a call to memset (which
> is legal C++, from everything I've read, since vector is contiguous
> and bytewise zero'ing POD types gives you actual zero for those types)
> DenseMatrix::zero() goes back down into the noise.
>
> Some results on my workstation for doubles: memset is roughly 2 orders
> of magnitude faster than std::fill when it is measurable.
>
> N          Fill method    Memset method
> 100        1e-06s         1e-06s
> 1000       1.2e-05s       1e-06s
> 10000      0.00012s       3e-06s
> 100000     0.001201s      5.2e-05s
> 1000000    0.012151s      0.000625s
>
> Timings are slightly faster for int/float but the trend is the same.
>
> We've stuck with the std::fill method for std::complex for now... I
> think in 99% of implementations memset should work there too, but I'm
> not sure if the standard guarantees std::complex has to look
> essentially like
>
> struct
> {
>   T real;
>   T imag;
> };
>
> i.e. still a POD?
>
>
> #include <iostream>
> #include <vector>
> #include <sys/time.h> // gettimeofday
>
> double elapsed(const timeval& tstart, const timeval& tstop)
> {
>   return
>     static_cast<double>(tstop.tv_sec - tstart.tv_sec) +
>     static_cast<double>(tstop.tv_usec - tstart.tv_usec)*1.e-6;
> }
>
> // Timings for fill vs. memset for zero'ing vectors of POD
> int main()
> {
>   // Timing objects
>   timeval tstart, tstop;
>
>   // vector size
>   unsigned N=1000000;
>   typedef double pod_t;
>
>   std::vector<pod_t> v(N);
>
>   // Fill method
>   gettimeofday (&tstart, NULL);
>   std::fill(v.begin(), v.end(), 0);
>   gettimeofday (&tstop, NULL);
>
>   std::cout << "Fill method: " << elapsed(tstart, tstop) << "s" << std::endl;
>
>   // Memset method
>   gettimeofday (&tstart, NULL);
>   memset(&v[0], 0, sizeof(pod_t) * v.size());
>   gettimeofday (&tstop, NULL);
>
>   std::cout << "Memset method: " << elapsed(tstart, tstop) << "s" <<
>     std::endl;
>
>   return 0;
> }
>
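
For illustration only, here is a minimal sketch of how the two suggestions in this thread (std::memset from <cstring> for plain arithmetic types, std::fill kept for std::complex) might be combined. The helper name zero_values and the trait use_memset are hypothetical and are not taken from libMesh; the dispatch through std::enable_if assumes a C++11 compiler, and the memset branch assumes that an all-bits-zero pattern represents zero for the type, which holds for integers and IEEE 754 floating point.

```cpp
#include <algorithm>    // std::fill
#include <complex>      // std::complex
#include <cstring>      // std::memset
#include <type_traits>  // std::enable_if, std::is_arithmetic, std::is_same
#include <vector>

// Hypothetical trait: true for arithmetic types whose all-bits-zero pattern
// is assumed to mean zero. bool is excluded because std::vector<bool> is a
// packed specialization and &v[0] is not a usable pointer there.
template <typename T>
struct use_memset
{
  static const bool value =
    std::is_arithmetic<T>::value && !std::is_same<T, bool>::value;
};

// Hypothetical helper (not libMesh's DenseMatrix::zero()):
// bytewise zero for int, float, double, ...
template <typename T>
typename std::enable_if<use_memset<T>::value>::type
zero_values(std::vector<T>& v)
{
  if (!v.empty())  // &v[0] is only valid for a non-empty vector
    std::memset(&v[0], 0, sizeof(T) * v.size());
}

// Fallback for everything else (e.g. std::complex<T>): keep std::fill,
// assigning a value-initialized T to every entry.
template <typename T>
typename std::enable_if<!use_memset<T>::value>::type
zero_values(std::vector<T>& v)
{
  std::fill(v.begin(), v.end(), T());
}
```

Calling zero_values(v) on a std::vector<double> would take the memset branch, while a std::vector<std::complex<double> > would fall through to std::fill, so the complex case stays as conservative as the original message left it.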