Thanks for the reminder; I forgot to include the details. The commands I used were as follows:
1. sudo opcontrol --setup --vmlinux=/home/ssg/vmlinux --separate=lib,thread,kernel --event=CPU_CLK_UNHALTED:100000 --callgraph=10
2. sudo opcontrol --reset
3. sudo opcontrol --start
4. Run my_program.
5. sudo opcontrol --stop
6. opreport -l --callgraph=10 --merge=tgid ./my_program | less
(I use the legacy mode instead of operf because my program uses many signal events. I tried operf, but it did not work, so I can only profile in
legacy mode.)
The result I get looks like this:
samples  %        image name               symbol name
-------------------------------------------------------------------------------
  133       0.3858  wc                       mr_worker
  34345    99.6142  wc                       out_cmp
35263    20.9390  libc-2.15.so             __memset_sse2
  35263    100.000  libc-2.15.so             __memset_sse2 [self]
According to the online manual, this means that the out_cmp function calls memset. But my out_cmp function is just a strcmp; it contains no memset calls at all, so this seems strange to me. What do you think? Any help would be appreciated. Thanks a lot~
Best
Emily
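
As I understand the manual's description of callgraph output, each block has one non-indented row for the function itself, with its callers on the indented rows above it and its callees on the indented rows below. The following is a minimal Python sketch of reading the block from the message above in that way, assuming blocks are separated by dashed lines as in that report. The names parse_callgraph and Block are hypothetical helpers for illustration, not part of oprofile.

```python
import re
from dataclasses import dataclass, field

# One row of `opreport -l --callgraph` output:
# optional indent, sample count, percentage, image name, symbol name.
ROW = re.compile(r"^(\s*)(\d+)\s+([\d.]+)\s+(\S+)\s+(.+?)\s*$")


@dataclass
class Block:
    """Hypothetical container for one callgraph block."""
    symbol: str = ""
    callers: list = field(default_factory=list)  # indented rows above the symbol
    callees: list = field(default_factory=list)  # indented rows below the symbol


def parse_callgraph(text):
    """Split report text into blocks of (callers, symbol, callees)."""
    blocks, current = [], Block()
    for line in text.splitlines():
        if line.startswith("---"):
            # A dashed line closes the block collected so far.
            if current.symbol:
                blocks.append(current)
            current = Block()
            continue
        match = ROW.match(line)
        if not match:
            continue  # column header or blank line
        indent, samples, _pct, _image, symbol = match.groups()
        entry = (symbol, int(samples))
        if not indent:
            current.symbol = symbol          # the function this block is about
        elif not current.symbol:
            current.callers.append(entry)    # seen before the function row
        else:
            current.callees.append(entry)    # seen after the function row
    if current.symbol:
        blocks.append(current)
    return blocks


# The block quoted in the message above, copied verbatim.
SAMPLE = """\
samples  %        image name               symbol name
-------------------------------------------------------------------------------
  133       0.3858  wc                       mr_worker
  34345    99.6142  wc                       out_cmp
35263    20.9390  libc-2.15.so             __memset_sse2
  35263    100.000  libc-2.15.so             __memset_sse2 [self]
"""

for block in parse_callgraph(SAMPLE):
    print(block.symbol, "callers:", block.callers, "callees:", block.callees)
```

Read this way, the report attributes the __memset_sse2 samples to the callers mr_worker and out_cmp. That only restates what the report says; it does not tell us whether the attribution itself is one of the mis-handled corner cases.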





2013/9/25 Maynard Johnson <maynardj@us.ibm.com>
On 09/25/2013 01:46 AM, benzhi cao wrote:
> Thanks a lot, it's very helpful to me.
> One more thing: when I profile with oprofile, I cannot tell which
> function calls glibc functions such as memmove. (I have already used
> the --callgraph option, but still get no result.) Do you know how to
> do that? Thanks~
Please be specific by telling us the commands you're using, the results you get, and what you think is wrong.  The callgraph option works *mostly*.  There are some corner cases mis-handled.  For example, see http://oprofile.sourceforge.net/doc/interpreting-callgraph.html.

-Maynard
> Best
> Emily
>
>
> 2013/9/23, Maynard Johnson <maynardj@us.ibm.com>:
>> On 09/21/2013 05:46 AM, benzhi cao wrote:
>>> Thanks for your reply. But now I have another question. When I use 32
>>> threads to run my app and use opreport to show the results, the
>>> results are a mess, and I can't read them easily.
>>> Do you know how to see the results clearly?
>> I mentioned in my first response that this likely would be the case.  Did
>> you try the tips I suggested?  Here's an example of what I was trying to
>> say:
>>
>> If I use 'operf --separate-thread' to profile a Java 1.6 app, doing
>> 'opreport' with no options shows the following jumbled mess:
>>
>> [mpjohn@oc1757000783 myJavaStuff]$ opreport
>> Using /home/mpjohn/myJavaStuff/oprofile_data/samples/ for samples
>> directory.
>> CPU: Intel Sandy Bridge microarchitecture, speed 2.401e+06 MHz (estimated)
>> Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit
>> mask of 0x00 (No unit mask) count 100000
>> Processes with a thread ID of 21373
>> Processes with a thread ID of 21376
>> Processes with a thread ID of 21378
>> Processes with a thread ID of 21379
>> Processes with a thread ID of 21380
>> Processes with a thread ID of 21382
>> Processes with a thread ID of 21383
>> Processes with a thread ID of 21384
>> Processes with a thread ID of 21385
>> Processes with a thread ID of 21386
>> Processes with a thread ID of 21387
>> Processes with a thread ID of 21388
>> Processes with a thread ID of 21389
>> Processes with a thread ID of 21390
>> Processes with a thread ID of 21391
>>         tid:21373|        tid:21376|        tid:21378|        tid:21379|
>>    tid:21380|        tid:21382|        tid:21383|        tid:21384|
>> tid:21385|        tid:21386|        tid:21387|        tid:21388|
>> tid:21389|        tid:21390|        tid:21391|
>>   samples|      %|  samples|      %|  samples|      %|  samples|      %|
>> samples|      %|  samples|      %|  samples|      %|  samples|      %|
>> samples|      %|  samples|      %|  samples|      %|  samples|      %|
>> samples|      %|  samples|      %|  samples|      %|
>> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>        82 100.000      3763 100.000      2763 100.000        91 100.000
>>    1 100.000         3 100.000        43 100.000         3 100.000         2
>> 100.000         1 100.000         6 100.000         2 100.000       163
>> 100.000         2 100.000    109761 100.000 java
>>
>>  . . . . blah, blah
>>
>> =======================================
>>
>> It's practically impossible to read such a report manually.  The easiest
>> thing for you to do is to pick individual processes (or threads) to focus
>> on, one at a time; for example:
>>
>> Focusing on the first process using 'opreport tgid:21373' I can get the
>> exact same jumbled mess, showing all the individual thread IDs.  Notice the
>> profile specification of "tgid:21373".  The "tgid" is "thread group ID",
>> which basically means you're asking opreport to show you all data for that
>> process and its child threads. Since this is Java 1.6, I happen to know that
>> the JVM creates threads to do its work (versus fork/exec which would create
>> new child *processes*).  So I then randomly choose one of the other threads
>> in the list above and use "tid" in the profile specification to see profile
>> data for that thread:
>>
>> opreport tid:21378
>> Using /home/mpjohn/myJavaStuff/oprofile_data/samples/ for samples
>> directory.
>> CPU: Intel Sandy Bridge microarchitecture, speed 2.401e+06 MHz (estimated)
>> Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit
>> mask of 0x00 (No unit mask) count 100000
>> CPU_CLK_UNHALT...|
>>   samples|      %|
>> ------------------
>>      2763 100.000 java
>>      CPU_CLK_UNHALT...|
>>        samples|      %|
>>      ------------------
>>           2554 92.4358 libj9jit24.so
>>             68  2.4611 no-vmlinux
>>             50  1.8096 libc-2.12.so
>>             27  0.9772 libj9vm24.so
>>             24  0.8686 libj9thr24.so
>>             16  0.5791 libj9prt24.so
>>             12  0.4343 libpthread-2.12.so
>>              5  0.1810 libj9hookable24.so
>>
>> ======================
>>
>> Hope that helps.
>>
>> -Maynard
>>
>>
>>
>>
>>
>>
>>> Best~
>>> Emily
>>>
>>>
>>> 2013/9/18 Michael Petlan <mpetlan@redhat.com>
>>>
>>>     Hi,
>>>
>>>     As far as I know, L2_LINES_IN can be used for that; see this
>>> reference guide:
>>>
>>> http://software.intel.com/sites/products/documentation/doclib/stdxe/2013/amplifierxe/win/win_reference/pmp/events/about_l2_cache_events.html
>>>
>>>     LLC_MISSES should cover the last-level cache, which may be the L3.
>>>
>>>     I have an L2_CACHE_MISS event for this, but you may not.
>>>
>>>     Please treat this as unofficial information.
>>>
>>>     Regards,
>>>     Michael
>>>
>>>
>>>
>>>     -------- Original message --------
>>>     Subject: Re: using oprofile to debug multi-processes programs on linux
>>>     Date: Mon, 16 Sep 2013 08:34:03 -0500
>>>     From: Maynard Johnson <maynardj@us.ibm.com>
>>>     To: benzhi cao <caobenzhi0915@gmail.com>
>>>     Cc: oprofile-list <oprofile-list@lists.sourceforge.net>
>>>
>>>     On 09/14/2013 08:24 PM, benzhi cao wrote:
>>>
>>>         Thanks so much for your reply. I can now collect the
>>> information for every process.
>>>         I also want to collect L2 cache misses, so I tried using ophelp
>>> to find an event for that. I think the event is LLC_MISSES, but I also
>>> found some people who use l2_lines_in to profile L2 cache misses, so I
>>> am confused and don't know which is the right event. Also, my hardware
>>> is Intel 64 architecture.
>>>         Best~
>>>         Emily
>>>
>>>     Adding oprofile-list back to cc so that maybe someone else on the list
>>> can help, since Intel is not my primary architecture of expertise.
>>>
>>>     -Maynard
>>>
>>>
>>>
>>>         2013/9/14 Maynard Johnson <maynardj@us.ibm.com>
>>>
>>>             On 09/13/2013 02:18 AM, benzhi cao wrote:
>>>             > Hi, can OProfile be used to profile the performance of
>>>             > multi-process programs? If it can, how do I see the
>>>             > performance of each process? (P.S.: The online manual shows
>>>             > that it can be used to profile multi-threaded programs, but
>>>             > I don't know whether it can be used for multi-process
>>>             > ones.) Any help will be appreciated, thanks a lot~
>>>             Hi, Emily,
>>>             Hopefully, you're using oprofile 0.9.9 so you can use operf
>>> instead of the older "legacy" opcontrol commands.  Using operf, you can
>>> specify to profile just the particular application (or process) you're
>>> interested in. If your application does fork/exec to create new child
>>> processes, operf will, by default, collect all sample data for the parent
>>> and children, but will aggregate all sample data. (ATTENTION:  0.9.9 has
>>> some key bug fixes for operf relating to following forked children.)  You
>>> can specify "--separate-thread" (see operf's man page for details) so that
>>> samples are separated by process and thread.  If you do collect a
>>> --separate-thread profile, be aware that opreport, being a text-based
>>> report generator does not handle too many axes of separation very well.
>>> You may get a report that looks like a jumbled mess, but would show a list
>>> of process IDs near the top of the report.  You could use that list of
>>> PIDs to generate per-process reports -- e.g.,
>>> 'opreport tgid:<pid#> [options]'.  In some cases, opreport gives up and
>>> tells you that you have to either provide a profile specification (e.g.,
>>> 'tgid:<pid#>' or, if profiling with multiple events,
>>> 'event:<event_name>').  More information on profile specifications can
>>> be found at
>>> http://oprofile.sourceforge.net/doc/results.html#profile-spec.
>>>
>>>             -Maynard
>>>
>>>
>>>             > Best~
>>>             > Emily
>>>             >
>>>             >
>>>             >
>>>             >
>>>             >
>>>             >
>>>             >
>>>             > _______________________________________________
>>>             > oprofile-list mailing list
>>>             > oprofile-list@lists.sourceforge.net
>>>             > https://lists.sourceforge.net/lists/listinfo/oprofile-list
>>>             >
>>>
>>>
>>
>>
>