From: Arthur Y. <ar...@ma...> - 2011-09-11 16:53:13
Hi, dear experts!

I am using oprofile for cluster-wide profiling of an x86_64 HPC cluster with dual-processor nodes (2 x Xeon E5345 @ 2.33GHz), running RHEL 4.8 with the 2.6.9-55.ELsmp kernel and oprofile 0.8.1-36 from the RHEL distro.

I have implemented a simple integration of oprofile with the Ganglia monitoring system, which lets me watch the value of some metric (BUS_TRAN_MEM in my case), obtained from oprofile, through Ganglia's web frontend on every node. For this purpose I start oprofile at node boot:

    sudo /usr/bin/opcontrol --start --event=BUS_TRAN_MEM:10000 --no-vmlinux

and run opreport periodically (once per minute) for node-wide profiling:

    opreport -l -m all -r -t 1

which lets me compute the metric's change over the past minute and send that information to Ganglia. That works fine, with no visible overhead.

Recently I tried to set up profiling differentiated by the single-node jobs running on the cluster. For this purpose I start oprofile at node boot with per-thread separation:

    sudo /usr/bin/opcontrol --start --event=BUS_TRAN_MEM:10000 --no-vmlinux --separate-thread

and run opreport once after each job finishes:

    opreport -t 1 -r tgid:${JOB_TGID} --merge=tgid

which gives me the job's "usage of the metric" during its execution. That also works quickly and perfectly.

But as a consequence, the node-wide opreport described above (opreport -l -m all -r -t 1) has slowed down significantly. Just after the oprofile daemon starts, opreport finishes in under a second, but after the daemon has been running for a day, it takes close to a minute, which creates noticeable overhead on the cluster nodes.

This is probably related to the growing volume of profiling data caused by --separate-thread profiling. Could somebody give a hint on any workarounds that would help decrease opreport's execution time in my case?

Best regards,
Arthur
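The once-per-minute collection step described above could be sketched as a small shell helper. The opreport output layout (a short header followed by a samples column), the header length, the state-file path, and the function names are all assumptions for illustration, not taken from the thread:

```shell
#!/bin/sh
# Sketch: compute the per-minute change of a sampled event count so it
# can be fed to Ganglia. The 3-line header and column layout assumed
# here are illustrative; adjust to the actual opreport output format.

STATE_FILE=${STATE_FILE:-/tmp/bus_tran_mem.prev}

# Sum the samples column (column 1) of an opreport listing read on stdin.
sum_samples() {
    awk 'NR > 3 { sum += $1 } END { print sum + 0 }'
}

# Print the increase since the previous invocation, persisting the
# current total in a state file between runs.
delta_since_last() {
    cur=$1
    prev=0
    [ -f "$STATE_FILE" ] && prev=$(cat "$STATE_FILE")
    printf '%s\n' "$cur" > "$STATE_FILE"
    echo $((cur - prev))
}
```

A cron job could then tie the pieces together, roughly: `cur=$(opreport -l -m all -r -t 1 | sum_samples)` followed by `gmetric -n BUS_TRAN_MEM -v "$(delta_since_last "$cur")" -t uint32` (the gmetric option names are an assumption here, so check them against your Ganglia version).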
From: Maynard J. <may...@us...> - 2011-09-14 00:56:50
Arthur Yuldashev wrote:
> [quoted message trimmed]
> But could somebody give a hint on any workarounds which help in
> decreasing opreport execution time in my case.

Arthur,

Firstly, I am assuming you never do 'opcontrol --reset'. Is that correct? Is it intentional? Do you really want accumulated profile data? If not, then execute the reset after every opreport, and that will resolve your slowdown issue. But if you *do* want the accumulated data, read on.

When using oprofile without any separation parameters (e.g., --separate=thread) and just one event, the sample data is stored in one file per sampled binary. So if your system generally has the same applications running all the time, the number of sample data files will not increase except when a not-seen-before application is executed. But with --separate=thread, as new processes are started (even if they're running the same app that earlier processes ran), you'll get a new sample file for each process.

I hope that helped.

-Maynard
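Maynard's "reset after every opreport" workaround could be wired up as a cron fragment roughly like the following. The file paths, schedule, and the cron.d form are illustrative assumptions; also note that --reset discards the accumulated samples, so a per-job tgid report run afterwards would see only samples collected since the last reset:

```shell
# /etc/cron.d/oprofile-report -- hypothetical cron fragment: report once
# per minute, then discard the accumulated samples so they never build up.
* * * * * root /usr/bin/opreport -l -m all -r -t 1 >> /var/log/opreport.log 2>&1 && /usr/bin/opcontrol --reset
```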