Re: [perfmon2] FLOPS on Nehalem
From: Dr. V. K. <Vin...@sc...> - 2009-10-07 17:30:17
Dear Stéphane,

stephane eranian wrote:
> Vincent,
>
> Hugh is right!
> Be careful that on Core i7, micro-ops are counted not instructions.

OK.

> Other users have also reported variations in the number of
> micro-ops reported for the same instruction. It depends on
> the floating point values passed and whether or not they
> reach the limit of their types (e.g., denormals).

OK, I modified the module according to what Hugh said, and it seems to
be quite OK (at least "enough" for a first prototype):

MXM:
  Measured : Perf : 4.818 [GFLOPS]
  Computed : Perf : 4.383 [GFLOPS]
MXV:
  Measured : Perf : 1.329 [GFLOPS]
  Computed : Perf : 1.172 [GFLOPS]

(the regular 10 % difference can easily be used as a correction)

> As for PFM_NHM_SEL_ANYTHR, it is not mandatory at all.
> In fact you probably don't want to use it. If you run pfmon
> on all logical cores (without --cpu-list), then you can compute
> total FLOPS by adding up each per-cpu counts. Alternatively
> you can use the --aggr option to have pfmon do it for you.

Hmm, maybe my first post was not clear, sorry for that. I don't use the
"pfmon" tool; I implemented what I needed. My goal is to measure all the
(actual) FLOPs on each (physical) core of a processor, regardless of
which application/thread/process is running on it. That is why I thought
I had to specify the PFM_NHM_SEL_ANYTHR flag in the structure. Maybe I
should have a look at the pfmon source code?

Thanks again for your help.

Cheers :)

Vince

--
---------------------------------------------------
Dr. Vincent KELLER
Fraunhofer-Institut für Algorithmen und Wissenschaftliches Rechnen SCAI
http://scai.fraunhofer.de
ADDRESS: Schloss Birlinghoven
         D - 53754 Sankt Augustin
         Germany
PHONE  : + 49 (0) 2241/14-2280
FAX    : + 49 (0) 2241/14-2258
E-MAIL : Vin...@sc...
---------------------------------------------------
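
For reference, the arithmetic discussed in this message can be sketched in a few lines of Python. This is only an illustration, not the module described above: the function names `overcount_ratio` and `total_gflops` are hypothetical, and the only real data are the four GFLOPS figures quoted in the message. The second helper shows the "add up the per-cpu counts" approach in its simplest form, assuming the counts are raw floating-point event counts (micro-ops on Core i7, so the result is an approximation of true FLOPS).

```python
# GFLOPS figures quoted in the message (measured by counters vs. computed by hand).
RESULTS = {
    "MXM": {"measured": 4.818, "computed": 4.383},
    "MXV": {"measured": 1.329, "computed": 1.172},
}


def overcount_ratio(measured, computed):
    """Return measured / computed; a value above 1 means the counter reads high."""
    return measured / computed


def total_gflops(per_cpu_counts, elapsed_seconds):
    """Sum per-CPU floating-point event counts and convert the total to GFLOPS.

    per_cpu_counts is a hypothetical list with one raw event count per logical CPU.
    """
    return sum(per_cpu_counts) / elapsed_seconds / 1e9


for name, r in RESULTS.items():
    ratio = overcount_ratio(r["measured"], r["computed"])
    print(f"{name}: measured/computed = {ratio:.3f}")
```

With the quoted figures the ratio is about 1.10 for MXM and about 1.13 for MXV, so a flat 10 % correction fits the first kernel better than the second.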