From: William C. <wc...@re...> - 2020-07-28 21:16:31
On 7/24/20 11:47 AM, J Lumby wrote:
> On 7/21/20 5:47 PM, J Lumby wrote:
>>
>> On 7/21/20 9:48 AM, William Cohen wrote:
>>>
>>> Could you give that new release of oprofile a try?
>>>
>>> -Will
>>
>>
>> I saw 1.4.0 available almost immediately after I posted that. I've now tried the same run on 1.4.0 (compiled on the target machine just to be sure it compiles with the same bfd headers and libs) and there are mixed results.
>>
>> It is still losing 80% of all userspace events : see below;
>>
>>
> I turned on --verbose="debug,convert" and from that, discovered the explanation for the very high loss.
>
> My workload was forking a large number of processes in sequence, each of which did a certain amount of work (typically around 5 seconds-worth on a single CPU) and then exited. I guess operf's handling of the mapping events takes something of the same order of time to understand each process, by which time it has gone.

One thought: how exactly are you starting the profiling? Like the following:

    operf <command_to_profile>

Or attaching to a running process?

    operf --pid <pid>

Or doing systemwide monitoring with --system-wide?

You might check whether the Linux perf command has a similar problem with the quick spawn and death of processes. operf uses the same mechanism in the kernel to collect performance event samples. There are some cases where the scanning of /proc can fall behind the rapid creation and death of processes. It would be useful to know whether the problem lies with oprofile or is also seen in perf.

-Will

>
> I changed the workload to do all the work in a single continuous process and now it works well :
>
> Profiling started at Fri Jul 24 11:15:41 2020
> Profiling stopped at Fri Jul 24 11:17:07 2020
>
> -- OProfile/operf Statistics --
> Nr. non-backtrace samples: 12791
> Nr. kernel samples: 2248
> Nr. user space samples: 10543
> Nr. samples lost due to sample address not in expected range for domain: 0
> Nr. lost kernel samples: 0
> Nr. samples lost due to sample file open failure: 0
> Nr. samples lost due to no permanent mapping: 34
> Nr. user context kernel samples lost due to no app info available: 89
> Nr. user samples lost due to no app info available: 0
> Nr. backtraces skipped due to no file mapping: 34
> Nr. hypervisor samples dropped due to address out-of-range: 0
> Nr. samples lost reported by perf_events kernel: 0
>
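
For comparing runs like the one quoted above, a small script can total the "lost" counters in an operf statistics block. This is only a rough sketch, not part of oprofile: the function names parse_operf_stats and lost_samples are made up for illustration, the parser assumes the "Nr. <label>: <count>" layout shown in the quoted output, and it simply adds up every counter whose label contains "lost", which assumes those counters do not overlap.

```python
import re

# Statistics text copied from the operf output quoted in this thread.
STATS = """\
Nr. non-backtrace samples: 12791
Nr. kernel samples: 2248
Nr. user space samples: 10543
Nr. samples lost due to sample address not in expected range for domain: 0
Nr. lost kernel samples: 0
Nr. samples lost due to sample file open failure: 0
Nr. samples lost due to no permanent mapping: 34
Nr. user context kernel samples lost due to no app info available: 89
Nr. user samples lost due to no app info available: 0
Nr. backtraces skipped due to no file mapping: 34
Nr. hypervisor samples dropped due to address out-of-range: 0
Nr. samples lost reported by perf_events kernel: 0
"""


def parse_operf_stats(text):
    """Map each 'Nr. <label>: <count>' line to {label: count}.

    Hypothetical helper: leading '>' quote markers are tolerated so the
    block can be pasted straight from a mail reply.
    """
    stats = {}
    for line in text.splitlines():
        m = re.match(r"\s*(?:>\s*)*Nr\.\s+(.*?):\s*(\d+)\s*$", line)
        if m:
            stats[m.group(1)] = int(m.group(2))
    return stats


def lost_samples(stats):
    """Sum every counter whose label mentions 'lost' (assumed disjoint)."""
    return sum(count for label, count in stats.items() if "lost" in label)


if __name__ == "__main__":
    stats = parse_operf_stats(STATS)
    total = stats["non-backtrace samples"]
    lost = lost_samples(stats)
    print(f"lost {lost} against {total} recorded samples ({lost / total:.2%})")
```

Running the same script over the statistics from the fork-heavy workload would give a like-for-like figure to set against the single-process run, and against whatever perf reports for the same workload.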