Re: Oprofile Query | How to derive relation between “total samples collected” and “sampling rate” wi

On 10/09/2012 01:34 PM, Purohit Amit-B09332 wrote:
> Hello Maynard Johnson,
> 
> Thank you for your reply.
> 
> [AmitP] Yes, 10686 is the total number of samples collected by oprofile (run for 10 secs) for the whole system (this covers the application, vmlinux, libc, librt, libpthread, etc. running on the system). This experiment was done on a single processor running at 1 GHz. To give you an idea, the system is fully active: the application itself consumes almost 50% of the CPU, vmlinux takes approx. 30%, and other components take another 12-13%. The cpu_idle shown is close to 7-8%. (When I state a CPU %, it is the relative % of each system component: the cumulative samples hit for that component divided by the total samples collected.)
So are you saying these percentages are calculated from the number of samples oprofile has collected?  You can't necessarily use those sample numbers to determine how busy your system is.  Below, you use the term "full active system", but unless you can verify that your application is never waiting for I/O (or otherwise stalled), you cannot claim that your application keeps the processor 100% busy.  Note that oprofile does *not* collect samples on a process that's been switched out while it's waiting on I/O.
> 
> As stated, the event used was CPU_CLK and sampling rate configured to 1000000. My questions are, 
> (a) In a full active system, if I reduce the oprofile sampling rate from 1000000 to 500000 (i.e. half), should I expect the total number of samples for the whole system to increase, or should it remain the same?
First of all, don't confuse sampling rate with the reset value (aka "count" value specified with your event specification).  Sampling rate is the number of samples per second, which is inversely proportional to the reset value.  So, what is the 1000000 (1 million) value you're referring to above?  You say below this is the default sampling rate.  This is neither the default sampling rate nor the default reset value.  In the current version of oprofile, the default reset value for CPU_CLK is 100000 (one hundred thousand), which (as I said in my earlier response) works out to a sampling rate of 10,000 samples per second.

But let's assume that the 1000000 value you call "sampling rate" is really the reset value (from the context of your questions, I believe that's a good assumption).  Then it's a true statement that as this value is decreased, the number of samples should increase (as I mentioned above).
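
To make the inverse relationship concrete, here is a minimal Python sketch. It assumes CPU_CLK ticks once per core cycle on the 1 GHz core mentioned in this thread and that the processor is busy the whole time, so the figures are upper bounds rather than predictions. The function name is made up for illustration and is not part of oprofile.

```python
def samples_per_second(clock_hz: int, reset_value: int) -> float:
    """One sample is recorded each time the counter has seen reset_value events."""
    return clock_hz / reset_value

CLOCK_HZ = 1_000_000_000  # 1 GHz core clock, as stated in the thread

# Reset values discussed in the thread: the poster's two values and the default.
rates = {rv: samples_per_second(CLOCK_HZ, rv) for rv in (1_000_000, 500_000, 100_000)}

for reset_value, rate in rates.items():
    print(f"reset value {reset_value:>9,}: up to {rate:>6,.0f} samples/sec")
```

Under those assumptions, halving the reset value from 1000000 to 500000 doubles the ceiling on samples per second; whether the collected total actually doubles still depends on how busy the system is.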

> (b) In a full active system, if the duration between the opcontrol --start and opcontrol --stop commands (i.e. the duration for which oprofile samples are captured) is increased from 10 secs to 100 secs, should I expect a linear increase in the total number of samples collected?
No -- for reasons I've already stated in my previous email.

-Maynard
> 
> Refer to these 2 results:
> Event = CPU_CLK
>  (1)  With Oprofile Sampling Rate = 1000000 (default) and oprofile test duration of ~10secs:
>          The total samples collected by Oprofile sampler = 10686.
> 
>  (2) With Oprofile Sampling Rate = 1000000 (default) and oprofile test duration of ~100secs:
>        The total samples collected by oprofile sampler = 28121.
> 
> As you can see, if I keep the oprofile sampling rate the same but increase the duration for which oprofile samples are captured, the total samples collected do not increase linearly.
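
The non-linearity in these two results is easier to see as an observed rate. The short Python sketch below only divides the reported totals by the approximate durations; the durations are the "~10secs" and "~100secs" figures quoted above, so the rates are approximate as well.

```python
# (approximate duration in seconds, total samples reported in the thread)
runs = [(10, 10686), (100, 28121)]

# Observed samples per second for each run.
observed_rates = [total / secs for secs, total in runs]

# How much the total grew when the duration grew tenfold.
growth = runs[1][1] / runs[0][1]

for (secs, total), rate in zip(runs, observed_rates):
    print(f"{secs:>3} s run: {total} samples, about {rate:.1f} samples/sec")
print(f"10x longer run produced {growth:.2f}x the samples")
```

A tenfold longer run yielding well under ten times the samples suggests the processor was not equally busy across the two runs, which fits the point that samples are not collected for a process that is switched out.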
> 
> Regards,
> Amit.
> ________________________________________
> From: Maynard Johnson [may...@us...]
> Sent: Tuesday, October 09, 2012 8:42 AM
> To: Purohit Amit-B09332
> Cc: oprofile-list
> Subject: Re: Oprofile Query | How to derive relation between “total samples collected” and “sampling rate” with the “oprofile test run duration”
> 
> On 10/09/2012 06:43 AM, Purohit Amit-B09332 wrote:
>> Greetings!!!!
>>
>>
>>
>> I have a query on Oprofile sampling rate for the event “CPU_CLK” on a powerpc core running at 1GHz core clock. Please help me here.
>>
>>
>>
>> We ran the following oprofile experiment, varying the oprofile test run duration and the CPU_CLK sampling rate:
>>
>>
>>
>> a)      Oprofile test run duration of:
>>
>> ·         10 secs and
>>
>> ·         100 secs.
>>
>> b)      CPU_CLK sampling rate variation; we tried the following sampling rates:
>>
>> ·         1000us (default)
>                 ^-- If this is supposed to mean the default sampling interval is 1,000 microseconds, that's not correct.  The default count (i.e., reset value) for CPU_CLK is 100,000, which means you get one sample for every 100,000 CPU_CLK events.  Since your clock rate is 1 GHz, this works out to 10,000 samples per second, which is a sampling interval of 0.1 millisecond (i.e., 100us).  So if my math is correct, it seems you're off by a factor of 10.
>>
>> ·         1100us
>>
>> ·         500us
>>
>> ·         250us
>>
>> ·         50us
>>
>>
>>
>> There are some observations where we would like your help.
>>
>> All the data referred to below was generated with IP traffic of around 90MBPS for an Evolved Packet System scenario:
>>
>>
>>
>> ·         Reference data with the default sampling rate (1000us) and the default oprofile test run duration (i.e. ~10sec):
>>
>>     -  With Oprofile Sampling Rate = 1000us (default) and oprofile test duration of ~10secs:
>>
>>         o   The total samples collected by the oprofile sampler = 10686.
> Is this the total samples for the *whole* system?  Or just your kernel driver handling the IP traffic?  How many processors does the system have?  Theoretically, the total samples for a "raw cycles" event should be:
>         total samples = samples/sec * run-time * number-of-processors
> And then . . .
>         total cycles events = total samples * reset-value
> 
> But the actual total samples depends on how the cycles event is counted (i.e., is it counted only when the process is in the run state?) and also how active your system is.  If the system is basically quiesced, the kernel will be doing just minimal housekeeping, so you won't see many samples.  If you have a multi-processor system that is quiesced, it's likely that some processors will be completely inactive, and zero samples would be collected for them.
> 
> So, in answer to your question below of how to relate “total samples collected” and “sampling rate” with the “oprofile test run duration” . . . it's complicated, and there's no easy formula.  But if you focus on one application that is 100% busy during the profiling time, then you should get pretty close to the following:
>         total samples for application = samples/sec * run-time
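
The formulas above can be written out as a short Python sketch. The function names are hypothetical, and the inputs are the values quoted in this thread (default reset value of 100,000, 1 GHz clock, one processor, 10-second run). The result is a theoretical ceiling that assumes the cycles event counts continuously, not a prediction of what oprofile will report.

```python
def expected_total_samples(samples_per_sec: int, run_time_sec: int,
                           num_processors: int = 1) -> int:
    """total samples = samples/sec * run-time * number-of-processors"""
    return samples_per_sec * run_time_sec * num_processors

def total_cycle_events(total_samples: int, reset_value: int) -> int:
    """total cycles events = total samples * reset-value"""
    return total_samples * reset_value

CLOCK_HZ = 1_000_000_000   # 1 GHz, from the thread
RESET_VALUE = 100_000      # default count for CPU_CLK, per the reply above

rate = CLOCK_HZ // RESET_VALUE                      # samples per second
ceiling = expected_total_samples(rate, run_time_sec=10)
events = total_cycle_events(ceiling, RESET_VALUE)

print(f"{rate} samples/sec, at most {ceiling} samples, {events} cycle events")
```

For a process that is switched out part of the time, the real total falls below this ceiling, which is why the single-application form of the formula only holds for a task that is 100% busy.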
> 
> Hope this helps.
> 
> -Maynard
>>
>>
>>
>> ·         Experiment data with varying sampling rate and the oprofile test run duration increased (i.e. ~100sec):
>>
>>     -  With Oprofile Sampling Rate = 1000us (default) and oprofile test duration of ~100secs:
>>         o   The total samples collected by the oprofile sampler = 28121.
>>
>>     -  With Oprofile Sampling Rate = 1100us and oprofile test duration of ~100secs:
>>         o   The total samples collected by the oprofile sampler = 26313.
>>
>>     -  With Oprofile Sampling Rate = 500us and oprofile test duration of ~100secs:
>>         o   The total samples collected by the oprofile sampler = 25463.
>>
>>     -  With Oprofile Sampling Rate = 250us and oprofile test duration of ~100secs:
>>         o   The total samples collected by the oprofile sampler = 26858.
>>
>>     -  With Oprofile Sampling Rate = 50us and oprofile test duration of ~100secs:
>>         o   The total samples collected by the oprofile sampler = 22759.
>>
>>
>>
>> The question where we need your help is the following:
>>
>> ·         How do we relate “total samples collected” and “sampling rate” with the “oprofile test run duration”?
>>
>> o   Below is some data based on the oprofile results:
>>
>> 1.       The total samples collected by the oprofile sampler with the 1000us (default) sampling rate (running for ~10sec) = 10686.
>>
>> ·         10686 * 1000us = 10686000us = 10.686 secs. (This approximately matches the oprofile test run duration, i.e. 10sec.)
>>
>> 2.       The total samples collected by the oprofile sampler with the 50us sampling rate (running for ~100sec) = 22759.
>>
>> ·         22759 * 50us = 1137950us = 1.13795 secs. (The same calculation as in step 1 does not match the oprofile test run duration here, i.e. 100sec.)
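
The same "samples times interval" check can be applied to all five ~100-second runs reported in this message. The Python sketch below takes the microsecond figures at face value as sampling intervals, which is the poster's reading and may not be how the values were actually applied by oprofile.

```python
# (stated sampling interval in microseconds, total samples) for the ~100 s runs
runs_100s = [(1000, 28121), (1100, 26313), (500, 25463), (250, 26858), (50, 22759)]

ACTUAL_DURATION_SEC = 100  # approximate run time reported in the thread

# Implied duration in seconds = samples * interval, following the poster's method.
implied = {interval_us: samples * interval_us / 1_000_000
           for interval_us, samples in runs_100s}

for interval_us, seconds in implied.items():
    print(f"{interval_us:>5} us: implied {seconds:>8.4f} s "
          f"vs ~{ACTUAL_DURATION_SEC} s actual")
```

None of the implied durations comes close to 100 seconds, and the sample totals barely change across a 20x range of stated intervals, so the check that happened to work for the 10-second run does not carry over.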
>>
>>
>>
>> Thanks & Regards,
>>
>> Amit.