From: Simon R. <si...@gm...> - 2010-12-30 04:21:20
Sorry, I should have proofread the entire thing before sending it.

On Wed, Dec 29, 2010 at 11:16 PM, Simon Ruggier <si...@gm...> wrote:
> Earlier this year, I was trying to debug a performance problem in a
> video playback application using oprofile. The software was capable of
> playing at acceptable framerates most of the time, but there were
> periods where it would freeze for a few seconds at a time. I tried to
> diagnose this using callgraph support, etc., but it was hard to
> identify the problem. I eventually figured it out by collecting
> samples during the problem time period, and then using opreport's
> differential output to compare them with samples collected during
> normal playback. This allowed me to find that the culprit was an
> unrelated application that was occasionally starving the video
> playback of CPU time. This was not obvious without focusing on the
> specific period of time in which the problem occurred, or without
> using the differential output, because the culprit's samples were
> blended in with everything else.
>
> It would be nice if it were possible to solve this use case (surely
> I'm not the analysis using just one session of sample collection. This

This sentence should read: It would be nice if it were possible to solve
this use case using just one session of sample collection (surely I'm not
the only person with this use case).

> leads me to think that it would be useful to store timing information
> in the samples. Storing it on a per-sample basis would obviously be
> overkill, but it seems to me that storing it with each context switch
> would provide very fine granularity without excessively bloating the
> size of the sample data. From reading the internals document, it seems
> that it wouldn't be necessary to modify any architecture-specific code
> to make this work, but the sample storage format in the event buffer,
> and maybe the cpu buffers, would need an extra field added to the
> events to store the time.
>
> Obviously this information wouldn't be helpful without a good
> reporting format, but I think it could be very helpful if combined
> with a GUI that performs either histogramming or windowing ([1]) with
> controllable granularity, and displays symbols proportionately in a
> format reminiscent of the line chart in [2], but with no intersecting
> lines, of course. The main purpose of the visual display would be to
> highlight variations over time so that the user can focus on specific
> time intervals. The actual symbol names could be shown in a tooltip,
> and maybe the user could generate textual reports as with opreport,
> but for a graphically selected interval or intervals of interest. I
> think a tool like this would be useful for the use case I described
> above, and also for distinguishing between the separate steps of
> algorithms that perform multiple CPU-intensive tasks in sequence.
>
> I don't expect anyone else to step up and implement all of this, but I
> also may not have time to implement it myself in the near future
> because of conflicting obligations from school. Please let me know if
> there's an existing way to deal with these use cases that I've missed,
> or if there is disagreement about changing the way samples are
> collected in the kernel. Any other discussion of this idea is also
> welcome.
>
> Thanks,
> Simon
>
> [1] http://en.wikipedia.org/wiki/Kernel_density_estimation
> [2] http://xkcd.com/657/
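
To make both halves of the idea a bit more concrete, here are two rough
sketches. First, the storage side: the kind of record that could be written
into the buffer at each context switch. The names below are invented purely
for illustration and do not correspond to oprofile's actual buffer format; I
also don't know yet what the right clock source in the kernel would be. The
point is only that one small marker per switch is cheap compared to
per-sample timestamps.

    /*
     * Illustrative sketch only -- not oprofile's real structures.  One
     * marker like this would be written at each context switch; every
     * ordinary sample that follows (until the next marker) could then be
     * attributed to the time interval starting at 'ns'.
     */
    #include <stdint.h>
    #include <stdio.h>
    #include <sys/types.h>

    struct ctx_switch_marker {
        uint64_t ns;    /* monotonic timestamp taken at the switch */
        pid_t    pid;   /* task being switched in                  */
        uint32_t cpu;   /* cpu buffer the marker was written to    */
    };

    int main(void)
    {
        /* The overhead is a few words per context switch, not per sample. */
        printf("marker size: %zu bytes per context switch\n",
               sizeof(struct ctx_switch_marker));
        return 0;
    }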
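
And second, an equally rough sketch of the reporting side: bucketing
timestamped samples into fixed-width time windows per symbol. The samples
and symbol table are faked here just to show the shape of a windowed report;
a real tool would read oprofile's sample files and resolve symbols the way
opreport does.

    /*
     * Sketch of windowed reporting: given samples that carry the timestamp
     * of the preceding context switch, count them per symbol in fixed-width
     * time windows.  Input data is faked for illustration.
     */
    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>

    #define NSYMS    4
    #define NWINDOWS 4

    struct sample {
        uint64_t ns;   /* timestamp inherited from the last switch marker */
        int      sym;  /* index into the symbol table                     */
    };

    int main(void)
    {
        static const char *syms[NSYMS] = {
            "decode_frame", "render", "memcpy", "other_app"
        };
        /* A handful of fake samples spread over roughly four seconds. */
        static const struct sample samples[] = {
            {  100000000ULL, 0 }, {  350000000ULL, 1 }, {  900000000ULL, 0 },
            { 1200000000ULL, 3 }, { 1300000000ULL, 3 }, { 1450000000ULL, 3 },
            { 2100000000ULL, 0 }, { 2600000000ULL, 2 }, { 3700000000ULL, 1 },
        };
        const uint64_t window_ns = 1000000000ULL;   /* 1-second windows */
        unsigned counts[NWINDOWS][NSYMS];
        size_t i;
        unsigned w;
        int s;

        memset(counts, 0, sizeof(counts));
        for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++) {
            w = (unsigned)(samples[i].ns / window_ns);
            if (w < NWINDOWS)
                counts[w][samples[i].sym]++;
        }

        /* One line per window: the per-symbol breakdown for that interval. */
        for (w = 0; w < NWINDOWS; w++) {
            printf("[%us - %us)", w, w + 1);
            for (s = 0; s < NSYMS; s++)
                printf("  %s=%u", syms[s], counts[w][s]);
            printf("\n");
        }
        return 0;
    }

Even with this toy data, the window dominated by other_app stands out
immediately, which is exactly the effect the visual display would be aiming
for.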