Can anyone help me with profiling questions?
Also, if I have a 32-bit OS installed on a 64-bit Intel Xeon machine, which of the following is the right code to read the time stamp counter?

static __inline__ unsigned long long rdtsc(void)
{
    unsigned long long int x;
    /* 0x0f 0x31 is the RDTSC opcode; "=A" means the EDX:EAX pair on 32-bit x86 */
    __asm__ volatile (".byte 0x0f, 0x31" : "=A" (x));
    return x;
}

static __inline__ unsigned long long rdtsc(void)
{
    unsigned hi, lo;
    /* RDTSC leaves the low half in EAX and the high half in EDX */
    __asm__ __volatile__ ("rdtsc" : "=a"(lo), "=d"(hi));
    return ((unsigned long long)lo) | (((unsigned long long)hi) << 32);
}

Essentially, my confusion is this: since it is a 64-bit machine, the counter should be 64 bits wide. If I use the second implementation, I get inconsistent values from the rdtsc counter.
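
For what it is worth, here is a minimal sketch of a reader that should behave the same on 32-bit and 64-bit builds. The TSC is a 64-bit counter in either mode, and RDTSC always returns it split across EDX:EAX, so the two-register form is the portable one; the "=A" constraint denotes the EDX:EAX pair only when compiling for 32-bit x86. The names tsc_combine and read_tsc are illustrative and not taken from any existing header.

```c
#include <stdint.h>

/* Join the two 32-bit halves that RDTSC leaves in EDX (high) and EAX (low). */
static inline uint64_t tsc_combine(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

#if defined(__i386__) || defined(__x86_64__)
/* Each half gets its own register constraint, so the same source
 * compiles correctly for both 32-bit and 64-bit x86 targets. */
static inline uint64_t read_tsc(void)
{
    uint32_t hi, lo;
    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return tsc_combine(hi, lo);
}
#endif
```

Note that RDTSC is not a serializing instruction, and readings taken on different cores may not be directly comparable on some systems. Either could be a source of inconsistent values, but that is a guess about the setup rather than something this code can detect.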


On Fri, Feb 15, 2008 at 10:59 AM, john bryant <bryant.johan@gmail.com> wrote:
Can anyone help me with profiling questions?

On Thu, Feb 14, 2008 at 5:48 PM, john bryant <bryant.johan@gmail.com> wrote:
Also, what is the right way to read performance counters, rdmsr or rdpmc, and why? Can someone explain the syntax of both?


On Thu, Feb 14, 2008 at 5:31 PM, john bryant <bryant.johan@gmail.com> wrote:
Can somebody point me to a tutorial on profiling the Linux kernel? I am writing an application, part of which profiles the relevant parts of the kernel for performance evaluation. I hope the oprofile developers can answer my questions. My understanding of profiling is as follows:

1. Enable performance counter profiling in hardware
 (e.g., in the Linux kernel we can do this by calling set_in_cr4(X86_CR4_PCE))
2. Write to a machine-specific register (wrmsr). For example, wrmsr(MSR_P6_EVNTSEL0, event, 0). Here I have selected the performance event select register of the P6 family. Here is the definition of wrmsr:

#define wrmsr(msr,val1,val2) \
        __asm__ __volatile__("wrmsr" \
                          : /* no outputs */ \
                          : "c" (msr), "a" (val1), "d" (val2))

Googling "wrmsr", I found that the "val2" argument (that is, the MSR) is mostly zero. Since val1 and val2 are both inputs, what is the significance of zero? Also, how can I select event MSRs (val2)? (Yes, I know I can do this with the Intel manual; is there a framework for selecting events?)
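
As a rough illustration of why the third argument tends to be zero: with the constraints in the macro above, val1 is loaded into EAX (the low 32 bits written to the MSR) and val2 into EDX (the high 32 bits). In the P6-family event select layout, the event code, unit mask, and control flags all sit in the low 32 bits, so the high half has nothing to carry. The sketch below assumes that layout; make_evntsel and the EVNTSEL_* names are hypothetical, and the event and unit-mask values must still come from the Intel manual's event tables for the specific CPU.

```c
#include <stdint.h>

/* Control bits in a P6-family event select register (low 32 bits). */
#define EVNTSEL_USR    (1u << 16)  /* count while running at CPL 1-3 */
#define EVNTSEL_OS     (1u << 17)  /* count while running at CPL 0   */
#define EVNTSEL_ENABLE (1u << 22)  /* enable counting                */

/* Build the low 32 bits of the event select value:
 * bits 0-7 hold the event code, bits 8-15 the unit mask. */
static inline uint32_t make_evntsel(uint8_t event, uint8_t umask,
                                    uint32_t flags)
{
    return (uint32_t)event | ((uint32_t)umask << 8) | flags;
}
```

With this helper, the call from step 2 would look like wrmsr(MSR_P6_EVNTSEL0, make_evntsel(event, umask, EVNTSEL_USR | EVNTSEL_OS | EVNTSEL_ENABLE), 0), where event and umask are placeholders.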
Consider an example where I am trying to enable the TLB counters and read them.
Now that I have enabled the TLB counters, how can I read performance parameters (e.g., TLB misses)? Any tutorial for that?
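
To make the reading step concrete, here is a hedged sketch of both instructions. RDMSR takes an MSR address in ECX and can run only at CPL 0; RDPMC takes a counter index in ECX and can also run in user mode once CR4.PCE is set. Both return the value in EDX:EAX. The names read_msr, read_pmc, and counter_delta are hypothetical, and the counter width passed to counter_delta is an assumption that depends on the CPU model.

```c
#include <stdint.h>

/* Events counted between two readings of a counter that is
 * width_bits wide, allowing for one wraparound in between. */
static inline uint64_t counter_delta(uint64_t before, uint64_t after,
                                     unsigned width_bits)
{
    uint64_t mask = (width_bits >= 64) ? ~0ULL
                                       : ((1ULL << width_bits) - 1);
    return (after - before) & mask;
}

#if defined(__i386__) || defined(__x86_64__)
/* Kernel mode only: faults if executed outside CPL 0. */
static inline uint64_t read_msr(uint32_t msr)
{
    uint32_t lo, hi;
    __asm__ __volatile__("rdmsr" : "=a"(lo), "=d"(hi) : "c"(msr));
    return ((uint64_t)hi << 32) | lo;
}

/* Faults in user mode unless CR4.PCE has been set. */
static inline uint64_t read_pmc(uint32_t counter)
{
    uint32_t lo, hi;
    __asm__ __volatile__("rdpmc" : "=a"(lo), "=d"(hi) : "c"(counter));
    return ((uint64_t)hi << 32) | lo;
}
#endif
```

A typical pattern would be to read the counter before and after the code of interest, for example with read_pmc(0) for the first counter, and pass both values to counter_delta.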