From: J L. <joh...@ho...> - 2020-07-21 01:03:01
Oprofiling an intensive postgresql workload running on a Lenovo P52 and linux kernel 5.7.4, I see very high numbers of both of these: approximately two-thirds of all user-space samples are being lost. See operf.log below.

The operf command I've used is

    operf --vmlinux /mnt/bullbild/linux-5.7.4-bullarch/vmlinux --session-dir /tmp/oprof_session.200720-195024 --events cpu_clk_unhalted:30000045:thread:1:1 --callgraph --separate-cpu --system-wide

and it works but tells me

    WARNING: Lost samples detected! See /tmp/oprof_session.200720-195024/samples/operf.log for details.

Lowering the sampling rate as suggested does not help: I tried with a count of 99999999 and saw the same high number of lost samples.

I assume oprofile is somehow unable to map the instruction pointer of the event to any process? How does it do this? Is there any setting, control, or hint that I can set to help it? Or, failing that, is there any way I can ask opreport to give me the results as raw instruction addresses?

Help!

Cheers, John Lumby

___________________________________________________________

Profiling started at Mon Jul 20 19:50:24 2020
Profiling stopped at Mon Jul 20 19:51:25 2020

-- OProfile/operf Statistics --
Nr. non-backtrace samples: 9435
Nr. kernel samples: 1924
Nr. user space samples: 7511
Nr. samples lost due to sample address not in expected range for domain: 0
Nr. lost kernel samples: 0
Nr. samples lost due to sample file open failure: 0
Nr. samples lost due to no permanent mapping: 4967
Nr. user context kernel samples lost due to no app info available: 0
Nr. user samples lost due to no app info available: 0
Nr. backtraces skipped due to no file mapping: 4967
Nr. hypervisor samples dropped due to address out-of-range: 0
Nr. samples lost reported by perf_events kernel: 0
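The "two-thirds" figure follows directly from the stats above; a quick arithmetic check (variable names are illustrative):

```python
# Quick check of the loss fraction reported in the operf.log stats above.
user_samples = 7511      # Nr. user space samples
lost_no_mapping = 4967   # Nr. samples lost due to no permanent mapping

loss_fraction = lost_no_mapping / user_samples
print(f"{loss_fraction:.1%} of user-space samples lost")  # → 66.1%
```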
From: William C. <wc...@re...> - 2020-07-21 13:49:01
On 7/20/20 9:03 PM, J Lumby wrote:
> Oprofiling an intensive postgresql workload running on a Lenovo P52 and linux kernel 5.7.4, I see very high numbers of both of these: approximately two-thirds of all user-space samples are being lost. See operf.log below.
> [...]

Hi,

Which version of oprofile was being used? I just did a release of oprofile-1.4.0; it has commit [a3742f]:

    Use the mmap offset to correctly compute the IP location in a file

    Newer versions of binutils are now splitting the file into multiple
    mmap loads. The assumption that the mmap for the executable code in
    the file starts at the beginning of the file (that the file offset is
    always zero) is no longer true. The computation to convert the IP
    address into an offset needs to also use the offset.

I wonder if a number of the samples might be discarded because they are not being mapped to reasonable locations in executables.

Could you give that new release of oprofile a try?

-Will
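The arithmetic the commit message describes can be sketched as follows. This is an illustrative sketch, not oprofile's actual code; the function name and parameters are assumptions, though the mapping start and file offset (pgoff) do correspond to fields of a perf mmap event:

```python
# Illustrative sketch (not oprofile's code) of the fix in commit [a3742f]:
# converting a sampled instruction pointer into an offset within the
# mapped file.
def ip_to_file_offset(ip, mmap_start, mmap_pgoff):
    # Old assumption: the mapping always starts at file offset 0, so
    #     offset = ip - mmap_start
    # With binutils now splitting a binary into several mmap loads, each
    # mapping's own file offset must be added back in:
    return ip - mmap_start + mmap_pgoff

# A text segment mapped at 0x401000 from file offset 0x1000:
print(hex(ip_to_file_offset(0x401234, 0x401000, 0x1000)))  # → 0x1234
```

With the old zero-offset assumption the same sample would resolve to 0x234, a location that may not even be executable code in the file, which is consistent with samples being discarded as unmappable.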
From: J L. <joh...@ho...> - 2020-07-21 21:47:23
On 7/21/20 9:48 AM, William Cohen wrote:
> Which version of oprofile was being used? I just did a release of oprofile-1.4.0; it has commit [a3742f]:

That run was on 1.3.0.

> [...]
> Could you give that new release of oprofile a try?
>
> -Will

I saw 1.4.0 available almost immediately after I posted that. I've now tried the same run on 1.4.0 (compiled on the target machine just to be sure it compiles with the same bfd headers and libs) and there are mixed results.

It is still losing 80% of all userspace events: see below. But the reports now show a completely different and much more "what-I-would-expect" list of tick counts by function, so maybe I can make do with this release.

This improvement makes me wonder if the fix is in the right direction but not quite there yet. Is that possible? Would switching on the -V "all" debugging help narrow down what is happening? Also, if you can point me to the code changes, I can take a look and maybe spot something.

Cheers, John

___________________________________________________________

on 1.4.0:

    operf --vmlinux /mnt/bullbild/linux-5.7.4-bullarch/vmlinux --session-dir /tmp/oprof_session.200721-173111 --events cpu_clk_unhalted:50000000:thread:1:1 --callgraph --separate-cpu --system-wide

Profiling started at Tue Jul 21 17:31:11 2020
Profiling stopped at Tue Jul 21 17:33:30 2020

-- OProfile/operf Statistics --
Nr. non-backtrace samples: 12423
Nr. kernel samples: 2170
Nr. user space samples: 10253
Nr. samples lost due to sample address not in expected range for domain: 0
Nr. lost kernel samples: 0
Nr. samples lost due to sample file open failure: 0
Nr. samples lost due to no permanent mapping: 8031
Nr. user context kernel samples lost due to no app info available: 0
Nr. user samples lost due to no app info available: 0
Nr. backtraces skipped due to no file mapping: 8029
Nr. hypervisor samples dropped due to address out-of-range: 0
Nr. samples lost reported by perf_events kernel: 0
From: J L. <joh...@ho...> - 2020-07-24 15:47:36
On 7/21/20 5:47 PM, J Lumby wrote:
> On 7/21/20 9:48 AM, William Cohen wrote:
>> Could you give that new release of oprofile a try?
>>
>> -Will
>
> I saw 1.4.0 available almost immediately after I posted that. [...]
>
> It is still losing 80% of all userspace events: see below.

I turned on --verbose="debug,convert" and from that discovered the explanation for the very high loss.

My workload was forking a large number of processes in sequence, each of which did a certain amount of work (typically around 5 seconds' worth on a single CPU) and then exited. I guess operf's handling of the mapping events takes something of the same order of time to understand each process, by which time it has gone.

I changed the workload to do all the work in a single continuous process and now it works well:

Profiling started at Fri Jul 24 11:15:41 2020
Profiling stopped at Fri Jul 24 11:17:07 2020

-- OProfile/operf Statistics --
Nr. non-backtrace samples: 12791
Nr. kernel samples: 2248
Nr. user space samples: 10543
Nr. samples lost due to sample address not in expected range for domain: 0
Nr. lost kernel samples: 0
Nr. samples lost due to sample file open failure: 0
Nr. samples lost due to no permanent mapping: 34
Nr. user context kernel samples lost due to no app info available: 89
Nr. user samples lost due to no app info available: 0
Nr. backtraces skipped due to no file mapping: 34
Nr. hypervisor samples dropped due to address out-of-range: 0
Nr. samples lost reported by perf_events kernel: 0
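The two workload shapes John contrasts can be sketched as below. This is a hedged reproduction sketch, not his postgresql workload: function names and durations are illustrative (his children each ran ~5 s; shortened here), but the structural difference is the one he describes — a sequence of short-lived forked children versus one long-lived process doing all the work:

```python
import os
import time

def burn_cpu(seconds):
    """Spin on the CPU for roughly the given duration."""
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        pass

def forked_workload(n_children, work_seconds):
    # Problematic shape: each child forks, works briefly, and exits.
    # operf may still be resolving a child's mappings after it has gone,
    # so its samples are lost with "no permanent mapping".
    reaped = 0
    for _ in range(n_children):
        pid = os.fork()
        if pid == 0:
            burn_cpu(work_seconds)
            os._exit(0)
        os.waitpid(pid, 0)
        reaped += 1
    return reaped

def single_process_workload(n_chunks, work_seconds):
    # Fixed shape: the same total work, all in one long-lived process
    # whose mappings operf has ample time to resolve.
    for _ in range(n_chunks):
        burn_cpu(work_seconds)

if __name__ == "__main__":
    forked_workload(3, 0.05)
    single_process_workload(3, 0.05)
```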
From: William C. <wc...@re...> - 2020-07-28 21:16:31
On 7/24/20 11:47 AM, J Lumby wrote:
> I turned on --verbose="debug,convert" and from that discovered the explanation for the very high loss.
>
> My workload was forking a large number of processes in sequence, each of which did a certain amount of work (typically around 5 seconds' worth on a single CPU) and then exited. I guess operf's handling of the mapping events takes something of the same order of time to understand each process, by which time it has gone.

One thought: how exactly are you starting the profiling? Like the following:

    operf <command_to_profile>

Or attaching to a running process?

    operf --pid <pid>

Or doing system-wide monitoring with --system-wide?

You might check to see if the linux perf command has a similar problem with the quick spawn and death of processes. operf is using the same mechanism in the kernel to collect performance event samples. There are some cases where the scanning of /proc can get behind the rapid creation and death of processes. It would be useful to know whether the problem lies with oprofile or is also seen in perf.

-Will

> I changed the workload to do all the work in a single continuous process and now it works well:
> [...]
From: John L. <joh...@ho...> - 2020-07-31 15:32:39
On 07/28/20 17:00, William Cohen wrote:
> One thought: how exactly are you starting the profiling?
> [...]
> Or doing system-wide monitoring with --system-wide?

Thanks William, my operf command is

    operf --vmlinux /mnt/bullbild/linux-5.7.4-bullarch/vmlinux --session-dir=/tmp/oprof_session.200724_120854 --callgraph --separate-cpu --system-wide

so yes, system-wide.

> You might check to see if the linux perf command has a similar problem with the quick spawn and death of processes. operf is using the same mechanism in the kernel to collect performance event samples. There are some cases where the scanning of /proc can get behind the rapid creation and death of processes. It would be useful to know whether the problem lies with oprofile or is also seen in perf.

I will try with perf some time, but I have never used it before (did not know about it), so it may be a little while before I get to it.

Cheers, John