From: Joerg B. <jo...@we...> - 2003-07-28 09:35:11
Nicholas Nethercote wrote:
> On Mon, 28 Jul 2003, Joerg Beyer wrote:
>
>>I am aware that valgrind (with the calltree skin) counts
>>the processor instructions used to run some code. Is there
>>also a way to count the CPU cycles?
>
> No.

Would it be good to have such a cycle counter?

>>valgrind is a cool tool to do optimizations on the source
>>(you could even do challenges: "who finds the fastest variant
>>of this piece of code?"). Now somebody tells me that counting
>>the instructions might be wrong, because not all instructions
>>take the same number of cycles. I assume that this is right,
>>but irrelevant from a practical point of view.
>
> On the contrary, it's extremely relevant from a practical point of view.
> It's almost impossible to know how long any one instruction will take.
> For example, on my machine, a memory read takes 1 cycle if it hits the L1
> cache, 10 cycles in the worst case if it only hits the L2 cache, and 206
> cycles in the worst case if it misses both caches. But out-of-order
> execution and all the other fancy modern CPU mumbo-jumbo means that these
> delays are rarely as bad as the worst case. Then throw in branch
> mispredictions, and other pipeline stalls... it's a mess.
>
> N

OK, so an implementation that does cycle counting needs a table
that lists, for every instruction, how many cycles it needs in the
different cache-hit/miss situations. Is this information available
(for all the processors)?

Joerg
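
To make the table idea concrete, here is a minimal sketch that uses only the three memory-read latencies quoted above (1, 10 and 206 cycles, measured on one particular machine). The function name and the hit/miss counts are hypothetical, chosen only for illustration, and the result is a worst-case bound, not a cycle count that valgrind or any real tool reports.

```python
# Worst-case memory-read latencies quoted in this thread (in cycles).
# They describe one specific machine, not processors in general.
L1_HIT_CYCLES = 1      # read hits the L1 cache
L2_HIT_CYCLES = 10     # read misses L1 but hits L2 (worst case)
MEMORY_CYCLES = 206    # read misses both caches (worst case)


def worst_case_read_cycles(l1_hits, l2_hits, misses):
    """Upper bound on cycles spent in memory reads.

    Assumes every read pays its full latency, i.e. no overlap from
    out-of-order execution, so real hardware will usually do better.
    """
    return (l1_hits * L1_HIT_CYCLES
            + l2_hits * L2_HIT_CYCLES
            + misses * MEMORY_CYCLES)


if __name__ == "__main__":
    # Hypothetical profile: 1000 reads, almost all of them cache hits.
    bound = worst_case_read_cycles(l1_hits=900, l2_hits=90, misses=10)
    print("worst-case read cycles:", bound)
```

With these assumed counts, the 10 reads that miss both caches account for more than half of the bound, even though they are only 1% of the reads. The same 1000 reads would cost anywhere between 1000 cycles (all L1 hits) and 206000 cycles (all misses) under this model, which is why an instruction count alone says little about run time.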