From: benzhi c. <cao...@gm...> - 2013-09-13 07:18:22
|
Hi, can OProfile be used to profile the performance of multi-process programs? And if it can, how do I see the performance of each process? (P.S.: The online manual shows that it can be used to profile multi-threaded programs, but I don't know whether it can be used for multi-process ones.) Any help will be appreciated, thanks a lot~ Best~ Emily |
From: Maynard J. <may...@us...> - 2013-09-13 19:17:23
|
Hi, Emily,

Hopefully you're using oprofile 0.9.9, so you can use operf instead of the older "legacy" opcontrol commands. With operf, you can profile just the particular application (or process) you're interested in. If your application uses fork/exec to create new child processes, operf will, by default, collect sample data for the parent and all children, but will aggregate all of that sample data. (ATTENTION: 0.9.9 has some key bug fixes for operf relating to following forked children.) You can specify "--separate-thread" (see operf's man page for details) so that samples are separated by process and thread.

If you do collect a --separate-thread profile, be aware that opreport, being a text-based report generator, does not handle too many axes of separation very well. You may get a report that looks like a jumbled mess, but it should show a list of process IDs near the top of the report. You could use that list of PIDs to generate per-process reports, e.g., 'opreport tgid:<pid#> [options]'. In some cases, opreport gives up and tells you that you have to provide a profile specification (e.g., 'tgid:<pid#>' or, if profiling with multiple events, 'event:<event_name>'). More information on profile specifications can be found at http://oprofile.sourceforge.net/doc/results.html#profile-spec.

-Maynard |
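The workflow described in the reply above can be sketched as a few shell helpers. This is only an illustration, assuming oprofile 0.9.9 or later with operf and opreport on the PATH; the function names (profile_per_process, tgid_spec, report_for_pid) and the ./my_app command are hypothetical and do not come from oprofile itself.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of oprofile.

# Profile a command, keeping samples separated by process and thread.
profile_per_process() {
    operf --separate-thread "$@"
}

# Build the profile specification that selects one process (thread group).
tgid_spec() {
    printf 'tgid:%s\n' "$1"
}

# Print a report for one process; extra opreport options may follow the PID.
report_for_pid() {
    pid=$1
    shift
    opreport "$@" "$(tgid_spec "$pid")"
}

# Example usage (not run here; ./my_app is a placeholder):
#   profile_per_process ./my_app
#   report_for_pid 21373 --symbols
```

Keeping the profile specification in its own function makes it easy to loop over the PID list that opreport prints near the top of a --separate-thread report.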
From: Maynard J. <may...@us...> - 2013-09-16 13:35:17
|
On 09/14/2013 08:24 PM, benzhi cao wrote:
> Thanks so much for your reply. I can collect the information for every process now.
> I also want to collect L2 cache misses, so I tried using ophelp to find an event
> I can use for that, and I think the event is LLC_MISSES. But I also found
> some people who use l2_lines_in to profile L2 cache misses, so I am confused and don't know
> which is the right event. What's more, my hardware is Intel architecture 64.
> Best~
> Emily

Adding oprofile-list back to cc so that maybe someone else on the list can help, since Intel is not my primary architecture of expertise.

-Maynard |
From: benzhi c. <cao...@gm...> - 2013-09-21 10:46:52
|
Thanks for your reply. But now I have another question. When I run my app with 32 threads and use opreport to show the results, the output is a mess, and I can't read it easily. Do you know how to view the results clearly?

Best~
Emily

2013/9/18 Michael Petlan <mp...@re...>
> Hi,
>
> As far as I know, L2_LINES_IN can be used for that; see this reference guide:
> http://software.intel.com/sites/products/documentation/doclib/stdxe/2013/amplifierxe/win/win_reference/pmp/events/about_l2_cache_events.html
>
> LLC_MISSES should refer to the last-level cache, which may be the L3.
>
> I have an L2_CACHE_MISS event for this, but maybe you don't.
>
> Please take this as unofficial information.
>
> Regards,
> Michael
>
> -------- Original message --------
> Subject: Re: using oprofile to debug multi-processes programs on linux
> Date: Mon, 16 Sep 2013 08:34:03 -0500
> From: Maynard Johnson <may...@us...>
> To: benzhi cao <cao...@gm...>
> Cc: oprofile-list <opr...@li...>
|
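Since the available cache events differ from one CPU model to another, one way to see the candidates on a given machine is to filter the ophelp listing. The following is a rough sketch, not an official method: list_cache_events is a hypothetical helper, and it assumes that each event in the ophelp output starts its own line in the form "EVENT_NAME: ...", which may not hold for every oprofile version.

```shell
#!/bin/sh
# Sketch only: list_cache_events is a hypothetical helper.
# Assumption: ophelp prints each event as "EVENT_NAME: (counter: ...)" at the
# start of a line, with the description on following indented lines.

# Read an ophelp-style listing on stdin and print the names of events whose
# names begin with L2 or LLC (case-insensitive).
list_cache_events() {
    grep -i -E '^(l2|llc)[a-z0-9_]*:' | cut -d: -f1
}

# Example usage (not run here):
#   ophelp | list_cache_events
```

Whether a listed event counts L2 misses or last-level (possibly L3) misses still has to be checked against the event description and the vendor's documentation for the specific processor.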
From: Maynard J. <may...@us...> - 2013-09-23 15:16:53
|
I mentioned in my first response that this likely would be the case. Did you try the tips I suggested? Here's an example of what I was trying to say:

If I use 'operf --separate-thread' to profile a Java 1.6 app, doing 'opreport' with no options shows the following jumbled mess:

[mpjohn@oc1757000783 myJavaStuff]$ opreport
Using /home/mpjohn/myJavaStuff/oprofile_data/samples/ for samples directory.
CPU: Intel Sandy Bridge microarchitecture, speed 2.401e+06 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000
Processes with a thread ID of 21373
Processes with a thread ID of 21376
Processes with a thread ID of 21378
Processes with a thread ID of 21379
Processes with a thread ID of 21380
Processes with a thread ID of 21382
Processes with a thread ID of 21383
Processes with a thread ID of 21384
Processes with a thread ID of 21385
Processes with a thread ID of 21386
Processes with a thread ID of 21387
Processes with a thread ID of 21388
Processes with a thread ID of 21389
Processes with a thread ID of 21390
Processes with a thread ID of 21391
tid:21373| tid:21376| tid:21378| tid:21379| tid:21380| tid:21382| tid:21383| tid:21384| tid:21385| tid:21386| tid:21387| tid:21388| tid:21389| tid:21390| tid:21391|
samples| %| samples| %| samples| %| samples| %| samples| %| samples| %| samples| %| samples| %| samples| %| samples| %| samples| %| samples| %| samples| %| samples| %| samples| %|
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
82 100.000 3763 100.000 2763 100.000 91 100.000 1 100.000 3 100.000 43 100.000 3 100.000 2 100.000 1 100.000 6 100.000 2 100.000 163 100.000 2 100.000 109761 100.000 java

. . . . blah, blah

=======================================

It's practically impossible to read such a report manually. The easiest thing for you to do is to pick individual processes (or threads) to focus on, one at a time; for example:

Focusing on the first process using 'opreport tgid:21373' I can get the exact same jumbled mess, showing all the individual thread IDs. Notice the profile specification of 'tgid:21373'. The "tgid" is "thread group ID", which basically means you're asking opreport to show you all data for that process and its child threads. Since this is Java 1.6, I happen to know that the JVM creates threads to do its work (versus fork/exec which would create new child *processes*). So I then randomly choose one of the other threads in the list above and use "tid" in the profile specification to see profile data for that thread:

opreport tid:21378
Using /home/mpjohn/myJavaStuff/oprofile_data/samples/ for samples directory.
CPU: Intel Sandy Bridge microarchitecture, speed 2.401e+06 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000
CPU_CLK_UNHALT...|
  samples|      %|
------------------
     2763 100.000 java
        CPU_CLK_UNHALT...|
          samples|      %|
        ------------------
             2554 92.4358 libj9jit24.so
               68  2.4611 no-vmlinux
               50  1.8096 libc-2.12.so
               27  0.9772 libj9vm24.so
               24  0.8686 libj9thr24.so
               16  0.5791 libj9prt24.so
               12  0.4343 libpthread-2.12.so
                5  0.1810 libj9hookable24.so

======================

Hope that helps.

-Maynard |
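Picking threads one at a time, as in the example above, can be partly automated. The sketch below is only an illustration: list_tids and per_thread_commands are hypothetical helper names, and the parsing assumes the report header contains lines of the exact form "Processes with a thread ID of N" as shown in the sample output. Whether opreport writes those header lines to stdout or stderr is also an assumption, so the usage comment captures both streams.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of oprofile.

# Read opreport output on stdin and print one thread ID per line, taken from
# header lines of the form "Processes with a thread ID of N".
list_tids() {
    sed -n 's/^[[:space:]]*Processes with a thread ID of \([0-9][0-9]*\)[[:space:]]*$/\1/p'
}

# Turn those thread IDs into per-thread opreport command lines.
per_thread_commands() {
    list_tids | while read -r tid; do
        printf 'opreport tid:%s\n' "$tid"
    done
}

# Example usage (not run here):
#   opreport 2>&1 | per_thread_commands
```

The helper only prints the commands rather than running them, so the list can be reviewed and narrowed to the threads with the most samples before generating any reports.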
From: benzhi c. <cao...@gm...> - 2013-09-25 06:46:28
|
Thanks a lot, that is very helpful. One more thing: when I profile with oprofile, I cannot tell which functions call glibc functions such as memmove. (I have already used the --callgraph option, but still get no result.) Do you know how to do that? Thanks~

Best
Emily |
From: benzhi c. <cao...@gm...> - 2013-09-25 07:45:10
|
By the way, I also think the callgraph is not accurate. Is that true?

2013/9/25 benzhi cao <cao...@gm...>
> Thanks a lot, it's very helpful to me.
> What's more, when I profile with oprofile, I cannot tell which
> function calls glibc functions like memmove (I have already used
> the --callgraph option, but still get no result). Do you know how to do
> that? Thanks~
> Best
> Emily
>
> 2013/9/23, Maynard Johnson <may...@us...>:
> > On 09/21/2013 05:46 AM, benzhi cao wrote:
> >> Thanks for your reply. But now I have another question. When I use 32
> >> threads to run my app and use opreport to show the results, the
> >> results are a mess and I can't read them easily.
> >> Do you know how to see the results clearly?
> > I mentioned in my first response that this likely would be the case. Did
> > you try the tips I suggested? Here's an example of what I was trying to
> > say:
> >
> > If I use 'operf --separate-thread' to profile a Java 1.6 app, doing
> > 'opreport' with no options shows the following jumbled mess:
> >
> > [mpjohn@oc1757000783 myJavaStuff]$ opreport
> > Using /home/mpjohn/myJavaStuff/oprofile_data/samples/ for samples directory.
> > CPU: Intel Sandy Bridge microarchitecture, speed 2.401e+06 MHz (estimated)
> > Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit
> > mask of 0x00 (No unit mask) count 100000
> > Processes with a thread ID of 21373
> > Processes with a thread ID of 21376
> > Processes with a thread ID of 21378
> > Processes with a thread ID of 21379
> > Processes with a thread ID of 21380
> > Processes with a thread ID of 21382
> > Processes with a thread ID of 21383
> > Processes with a thread ID of 21384
> > Processes with a thread ID of 21385
> > Processes with a thread ID of 21386
> > Processes with a thread ID of 21387
> > Processes with a thread ID of 21388
> > Processes with a thread ID of 21389
> > Processes with a thread ID of 21390
> > Processes with a thread ID of 21391
> > tid:21373| tid:21376| tid:21378| tid:21379|
> > tid:21380| tid:21382| tid:21383| tid:21384|
> > tid:21385| tid:21386| tid:21387| tid:21388|
> > tid:21389| tid:21390| tid:21391|
> > samples| %| samples| %| samples| %| samples| %|
> > samples| %| samples| %| samples| %| samples| %|
> > samples| %| samples| %| samples| %| samples| %|
> > samples| %| samples| %| samples| %|
> > ------------------------------------------------------------------------------------------------------------------------
> > 82 100.000 3763 100.000 2763 100.000 91 100.000
> > 1 100.000 3 100.000 43 100.000 3 100.000 2
> > 100.000 1 100.000 6 100.000 2 100.000 163
> > 100.000 2 100.000 109761 100.000 java
> >
> > . . . . blah, blah
> >
> > =======================================
> >
> > It's practically impossible to read such a report manually. The easiest
> > thing for you to do is to pick individual processes (or threads) to focus
> > on, one at a time; for example:
> >
> > Focusing on the first process using 'opreport tgid:21373' I can get the
> > exact same jumbled mess, showing all the individual thread IDs. Notice the
> > profile specification of 'tgid:21373'. The "tgid" is "thread group ID",
> > which basically means you're asking opreport to show you all data for that
> > process and its child threads. Since this is Java 1.6, I happen to know that
> > the JVM creates threads to do its work (versus fork/exec which would create
> > new child *processes*). So I then randomly choose one of the other threads
> > in the list above and use "tid" in the profile specification to see profile
> > data for that thread:
> >
> > opreport tid:21378
> > Using /home/mpjohn/myJavaStuff/oprofile_data/samples/ for samples directory.
> > CPU: Intel Sandy Bridge microarchitecture, speed 2.401e+06 MHz (estimated)
> > Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit
> > mask of 0x00 (No unit mask) count 100000
> > CPU_CLK_UNHALT...|
> >   samples|      %|
> > ------------------
> >      2763 100.000 java
> >     CPU_CLK_UNHALT...|
> >       samples|      %|
> >     ------------------
> >          2554 92.4358 libj9jit24.so
> >            68  2.4611 no-vmlinux
> >            50  1.8096 libc-2.12.so
> >            27  0.9772 libj9vm24.so
> >            24  0.8686 libj9thr24.so
> >            16  0.5791 libj9prt24.so
> >            12  0.4343 libpthread-2.12.so
> >             5  0.1810 libj9hookable24.so
> >
> > ======================
> >
> > Hope that helps.
> >
> > -Maynard
> >
> >> Best~
> >> Emily
> >>
> >> 2013/9/18 Michael Petlan <mp...@re...>
> >>
> >>     Hi,
> >>
> >>     As far as I know, L2_LINES_IN can be used for that; see this
> >>     reference guide:
> >>
> >>     http://software.intel.com/sites/products/documentation/doclib/stdxe/2013/amplifierxe/win/win_reference/pmp/events/about_l2_cache_events.html
> >>
> >>     LLC_MISSES should cover the last-level cache, which may be the L3.
> >>
> >>     I have an L2_CACHE_MISS event for this, but maybe you don't.
> >>
> >>     Please take this as unofficial information.
> >>
> >>     Regards,
> >>     Michael
> >>
> >>     -------- Original message --------
> >>     Subject: Re: using oprofile to debug multi-processes programs on linux
> >>     Date: Mon, 16 Sep 2013 08:34:03 -0500
> >>     From: Maynard Johnson <may...@us...>
> >>     To: benzhi cao <cao...@gm...>
> >>     Cc: oprofile-list <opr...@li...>
> >>
> >>     On 09/14/2013 08:24 PM, benzhi cao wrote:
> >>
> >>         Thanks so much for your reply. I can collect the information
> >>         for every process now.
> >>         I also want to collect L2 cache misses, so I tried to use ophelp
> >>         to find an event that I can use for L2 cache misses. I think the
> >>         event is LLC_MISSES. But I also found some people who use
> >>         l2_lines_in to profile L2 cache misses, so I am confused; I
> >>         don't know which is the right event. Also, my hardware is Intel
> >>         64 architecture.
> >>         Best~
> >>         Emily
> >>
> >>     Adding oprofile-list back to cc so that maybe someone else on the list
> >>     can help, since Intel is not my primary architecture of expertise.
> >>
> >>     -Maynard
|
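Editorial sketch, not part of the original thread: the per-thread drill-down Maynard describes in the quoted reply (read the thread IDs from the top of the report, then run 'opreport tid:<id>' for one thread at a time) could be scripted roughly as below. It assumes the header lines look exactly like the "Processes with a thread ID of N" lines in the quoted report; the function names and the sample text are hypothetical, and the script only prints the opreport commands rather than running them.

```shell
#!/bin/sh
# Sketch: turn the thread-ID header of a --separate-thread report into
# one "opreport tid:<id>" command per thread.
# Assumption: header lines read "Processes with a thread ID of <number>",
# as in the report quoted in this thread. opreport itself is never run here.

# Read opreport output on stdin; print one thread ID per line.
list_tids() {
    sed -n 's/^.*Processes with a thread ID of \([0-9][0-9]*\).*$/\1/p'
}

# Read thread IDs on stdin; print the matching opreport command lines.
per_thread_commands() {
    while read -r tid; do
        printf 'opreport tid:%s\n' "$tid"
    done
}

# Demo on a short, hypothetical excerpt of a report header.
sample_header='Processes with a thread ID of 21373
Processes with a thread ID of 21376
Processes with a thread ID of 21378'

printf '%s\n' "$sample_header" | list_tids | per_thread_commands
```

In real use one would presumably pipe actual 'opreport' output into list_tids instead of the sample text, and swap 'tid:' for 'tgid:' to get whole-process reports.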
From: Maynard J. <may...@us...> - 2013-09-25 13:22:10
|
On 09/25/2013 01:46 AM, benzhi cao wrote:
> Thanks a lot, it's very helpful to me.
> What's more, when I profile with oprofile, I cannot tell which
> function calls glibc functions like memmove (I have already used
> the --callgraph option, but still get no result). Do you know how to do
> that? Thanks~
Please be specific by telling us the commands you're using, the results you get, and what you think is wrong. The callgraph option works *mostly*. There are some corner cases that are mishandled. For example, see http://oprofile.sourceforge.net/doc/interpreting-callgraph.html.

-Maynard
> Best
> Emily
|
From: benzhi c. <cao...@gm...> - 2013-09-25 15:07:39
|
Thanks for the reminder; I forgot to include the details. The commands I used were as follows:

1. sudo opcontrol --setup --vmlinux=/home/ssg/vmlinux --separate=lib,thread,kernel --event=CPU_CLK_UNHALTED:100000 --callgraph=10
2. sudo opcontrol --reset
3. sudo opcontrol --start
4. Run my_program.
5. sudo opcontrol --stop
6. opreport -l --callgraph=10 --merge=tgid ./my_program | less

(I use the legacy mode instead of operf because my program uses many signal events. I tried operf, but it doesn't work, so I can only use the legacy mode to profile.)

The result I get looks like this:

samples  %        image name    symbol name
-------------------------------------------------------------------------------
133      0.3858   wc            mr_worker
34345    99.6142  wc            out_cmp
35263    20.9390  libc-2.15.so  __memset_sse2
35263    100.000  libc-2.15.so  __memset_sse2 [self]

According to the online manual, this means that the out_cmp function calls memset. But my out_cmp function is just a strcmp; there is no memset call in it at all, so I think this is strange. What do you think? Any help would be appreciated. Thanks a lot~

Best
Emily
|
From: Maynard J. <may...@us...> - 2013-09-25 16:24:28
|
On 09/25/2013 10:07 AM, benzhi cao wrote:
> Thanks for the reminder; I forgot to include the details. The commands I used were as follows:
> 1. sudo opcontrol --setup --vmlinux=/home/ssg/vmlinux --separate=lib,thread,kernel --event=CPU_CLK_UNHALTED:100000 --callgraph=10
> 2. sudo opcontrol --reset
> 3. sudo opcontrol --start
> 4. run my_program.
> 5. sudo opcontrol --stop
> 6. opreport -l --callgraph=10 --merge=tgid ./my_program | less
> (I use the legacy mode instead of operf because I use many signal events in my program. I tried operf, but it doesn't work, so I can only use
> the legacy mode to profile.)
> The result I get looks like this:
> samples  %        image name               symbol name
> -------------------------------------------------------------------------------
>   133       0.3858  wc                       mr_worker
>   34345    99.6142  wc                       out_cmp
> 35263    20.9390  libc-2.15.so             __memset_sse2
>   35263    100.000  libc-2.15.so             __memset_sse2 [self]
> According to the online manual, this means the out_cmp function calls memset. But my out_cmp function is just strcmp; there are no memset calls at all, so I think it is strange. What do you think? Any help would be appreciated. Thanks a lot~

First, a question . . . What version of oprofile are you using?

It's unlikely that strcmp is calling memset (and resulting in the issue described in http://oprofile.sourceforge.net/doc/interpreting-callgraph.html). But without seeing the source of your out_cmp function, I can't guess what else might be involved. Another possibility is that some sample information was lost during profiling, causing opreport to falsely conclude that out_cmp calls memset. Did you see any messages about lost/dropped samples or overflows?

By the way, we're not really very interested in opcontrol anymore since it has been deprecated for the last two releases. By next release, it will be gone. You said above that operf does not work with your program because of signals used by that program. Please post a new thread to the list and completely describe the problem and how we can reproduce it. Thanks.

-Maynard

> Best
> Emily
>
> 2013/9/25 Maynard Johnson <may...@us...>
>
> On 09/25/2013 01:46 AM, benzhi cao wrote:
> > Thanks a lot, it's very helpful to me.
> > What's more, when I profile with oprofile, I cannot tell which
> > function calls the glibc functions like memmove (I have already used
> > the --callgraph option, but still no result). Do you know how to do
> > that? Thanks~
> Please be specific by telling us the commands you're using, the results you get, and what you think is wrong. The callgraph option works *mostly*. There are some corner cases mis-handled. For example, see http://oprofile.sourceforge.net/doc/interpreting-callgraph.html.
>
> -Maynard
> > Best
> > Emily
> >
> > 2013/9/23, Maynard Johnson <may...@us...>:
> >> On 09/21/2013 05:46 AM, benzhi cao wrote:
> >>> Thanks for your reply. But now I have another question. When I use 32
> >>> threads to run my app, and use opreport to show the results, the
> >>> results were a mess, and I can't read them easily.
> >>> Do you know how to see the results clearly?
> >> I mentioned in my first response that this likely would be the case. Did
> >> you try the tips I suggested? Here's an example of what I was trying to
> >> say:
> >>
> >> If I use 'operf --separate-thread' to profile a Java 1.6 app, doing
> >> 'opreport' with no options shows the following jumbled mess:
> >>
> >> [mpjohn@oc1757000783 myJavaStuff]$ opreport
> >> Using /home/mpjohn/myJavaStuff/oprofile_data/samples/ for samples
> >> directory.
> >> CPU: Intel Sandy Bridge microarchitecture, speed 2.401e+06 MHz (estimated)
> >> Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit
> >> mask of 0x00 (No unit mask) count 100000
> >> Processes with a thread ID of 21373
> >> Processes with a thread ID of 21376
> >> Processes with a thread ID of 21378
> >> Processes with a thread ID of 21379
> >> Processes with a thread ID of 21380
> >> Processes with a thread ID of 21382
> >> Processes with a thread ID of 21383
> >> Processes with a thread ID of 21384
> >> Processes with a thread ID of 21385
> >> Processes with a thread ID of 21386
> >> Processes with a thread ID of 21387
> >> Processes with a thread ID of 21388
> >> Processes with a thread ID of 21389
> >> Processes with a thread ID of 21390
> >> Processes with a thread ID of 21391
> >> tid:21373| tid:21376| tid:21378| tid:21379|
> >> tid:21380| tid:21382| tid:21383| tid:21384|
> >> tid:21385| tid:21386| tid:21387| tid:21388|
> >> tid:21389| tid:21390| tid:21391|
> >> samples| %| samples| %| samples| %| samples| %|
> >> samples| %| samples| %| samples| %| samples| %|
> >> samples| %| samples| %| samples| %| samples| %|
> >> samples| %| samples| %| samples| %|
> >> ------------------------------------------------------------------------------
> >> 82 100.000 3763 100.000 2763 100.000 91 100.000
> >> 1 100.000 3 100.000 43 100.000 3 100.000 2
> >> 100.000 1 100.000 6 100.000 2 100.000 163
> >> 100.000 2 100.000 109761 100.000 java
> >>
> >> . . . . blah, blah
> >>
> >> =======================================
> >>
> >> It's practically impossible to read such a report manually.
> >> The easiest thing for you to do is to pick individual processes (or
> >> threads) to focus on, one at a time; for example:
> >>
> >> Focusing on the first process using 'opreport tgid:21373', I get the
> >> exact same jumbled mess, showing all the individual thread IDs. Notice the
> >> profile specification of "tgid:21373". The "tgid" is "thread group ID",
> >> which basically means you're asking opreport to show you all data for that
> >> process and its child threads. Since this is Java 1.6, I happen to know that
> >> the JVM creates threads to do its work (versus fork/exec, which would create
> >> new child *processes*). So I then randomly choose one of the other threads
> >> in the list above and use "tid" in the profile specification to see profile
> >> data for that thread:
> >>
> >> opreport tid:21378
> >> Using /home/mpjohn/myJavaStuff/oprofile_data/samples/ for samples
> >> directory.
> >> CPU: Intel Sandy Bridge microarchitecture, speed 2.401e+06 MHz (estimated)
> >> Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit
> >> mask of 0x00 (No unit mask) count 100000
> >> CPU_CLK_UNHALT...|
> >>   samples|      %|
> >> ------------------
> >>      2763 100.000 java
> >>         CPU_CLK_UNHALT...|
> >>           samples|      %|
> >>         ------------------
> >>              2554 92.4358 libj9jit24.so
> >>                68  2.4611 no-vmlinux
> >>                50  1.8096 libc-2.12.so
> >>                27  0.9772 libj9vm24.so
> >>                24  0.8686 libj9thr24.so
> >>                16  0.5791 libj9prt24.so
> >>                12  0.4343 libpthread-2.12.so
> >>                 5  0.1810 libj9hookable24.so
> >>
> >> ======================
> >>
> >> Hope that helps.
> >>
> >> -Maynard
> >>
> >>> Best~
> >>> Emily
> >>>
> >>> 2013/9/18 Michael Petlan <mp...@re...>
> >>>
> >>> Hi,
> >>>
> >>> As far as I know, L2_LINES_IN can be used for that; see this reference
> >>> guide:
> >>>
> >>> http://software.intel.com/sites/products/documentation/doclib/stdxe/2013/amplifierxe/win/win_reference/pmp/events/about_l2_cache_events.html
> >>>
> >>> LLC_MISSES concerns the last-level cache, which may be the L3.
> >>>
> >>> I have an L2_CACHE_MISS event for this, but you may not.
> >>>
> >>> Please take this as unofficial information.
> >>>
> >>> Regards,
> >>> Michael
> >>>
> >>> -------- Original message --------
> >>> Subject: Re: using oprofile to debug multi-processes programs on linux
> >>> Date: Mon, 16 Sep 2013 08:34:03 -0500
> >>> From: Maynard Johnson <may...@us...>
> >>> To: benzhi cao <cao...@gm...>
> >>> Cc: oprofile-list <oprofile-list@lists.sourceforge.net>
> >>>
> >>> On 09/14/2013 08:24 PM, benzhi cao wrote:
> >>>
> >>> Thanks so much for your reply. I can collect the information
> >>> for every process now.
> >>> I also want to collect L2 cache misses, so I tried to use ophelp
> >>> to find the event that I can use for L2 cache misses. I think the
> >>> event is LLC_MISSES. But I also found some people who use
> >>> l2_lines_in to profile L2 cache misses, so I am confused and don't
> >>> know which is the right event. Also, my hardware is Intel 64
> >>> architecture.
> >>> Best~
> >>> Emily
> >>>
> >>> Adding oprofile-list back to cc so that maybe someone else on the list
> >>> can help, since Intel is not my primary architecture of expertise.
> >>>
> >>> -Maynard
|
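For anyone who wants to script the drill-down described in this thread, here is a minimal sketch. It pulls the thread IDs out of the header of a jumbled 'opreport' run and builds one 'opreport tid:<id>' command per thread. The helper name per_thread_commands and the SAMPLE text are made up for illustration and are not part of oprofile; only the "Processes with a thread ID of N" header lines and the 'tid:' profile specification are taken from the messages in this thread.

```python
import re

# Header line format as it appears in the opreport output quoted in this thread.
TID_LINE = re.compile(r"Processes with a thread ID of (\d+)")


def per_thread_commands(report_text):
    """Return one 'opreport tid:<id>' command per thread ID in the header.

    Hypothetical helper for illustration; it is not part of oprofile.
    """
    tids = []
    for match in TID_LINE.finditer(report_text):
        tid = match.group(1)
        if tid not in tids:  # keep first-seen order, skip repeated IDs
            tids.append(tid)
    return ["opreport tid:" + tid for tid in tids]


# Made-up excerpt with the same shape as the header shown in the thread.
SAMPLE = """\
CPU: Intel Sandy Bridge microarchitecture, speed 2.401e+06 MHz (estimated)
Processes with a thread ID of 21373
Processes with a thread ID of 21376
Processes with a thread ID of 21378
"""

if __name__ == "__main__":
    for command in per_thread_commands(SAMPLE):
        print(command)
```

The printed commands can then be run one at a time, in the same way as the 'opreport tid:21378' example in the thread. Swapping the "tid:" prefix for "tgid:" would give the per-process form instead, assuming the IDs collected are process IDs.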