|
From: Liu J. <ice...@gm...> - 2013-04-24 03:44:27
|
Dear all,

I used Valgrind (Callgrind) to profile database cache performance. The host machine is a VM guest with a Xeon CPU.

Without instrumentation, the performance of the DBMS with the cache on is 20% higher than with the cache off. With Callgrind, the call graph shows a 5x performance improvement.

My question is whether Valgrind is suitable for profiling inside a VM. For example, with the IR conversion the performance gap becomes larger, which does not reflect the real gap. If you are familiar with profiling, I would appreciate it if you could recommend some other profiling tools.

Thanks a lot!!!

Best regards,
James |
|
From: John R. <jr...@bi...> - 2013-04-24 03:56:51
|
> I appreciate you could recommend some other profile tools.

Get a recent AMD CPU and investigate this work:

[Valgrind-developers] Believed to be complete: LWP support for Valgrind (feature request 317441)
Rick Gorton <rcgorton verizon net> 04/07/2013

-- |
|
From: Josef W. <Jos...@gm...> - 2013-04-24 08:42:45
|
On 24.04.2013 05:44, Liu James wrote:
> Dear all,
>
> I used Valgrind(callgrind) to profile database cache performance, and
> the host machine is a VM guest with XEON CPU.
>
> Without instrumentation, the performance of dbms with cache-on is 20%
> higher

What do you mean by higher? Better or worse performance?

> than cache-off. With callgrind, the call graph shows there is a
> 5x performance improvement.

For Callgrind/Cachegrind, the environment doesn't matter, as it is a simulation of a simple cache hierarchy that only takes user-level code into account. So it does not matter whether it is running within a VM guest.

BUT: I would assume that a user-level simulation cannot really capture what is going on in a database, as there is probably a lot of I/O and OS involvement, and more importantly, the performance probably depends on external latency/bandwidth constraints from network, drives, etc.

> My point is whether Valgrind is suitable for profiling working in VM?
> e.g, with IR conversion, the performance gap becomes larger, which
> does not reflect the real gap. If you are familiar with profiling, I
> appreciate you could recommend some other profile tools.

Any profiler which does system-wide sampling should be fine (perf, oprofile, other Intel/AMD tools, ...). But the VM must support passing performance counters through to guests if the tool is to work in the guest. I think that perf running within a KVM guest should be able to do this.

Interpreting the results is another issue. One idea would be to check how much the performance depends on external constraints. A VM should allow you to throttle e.g. the network connection.

On the other hand - as John suggested - why not get rid of the VM for profiling first? More tools should work, and you can also compare results with/without the VM afterwards. The VM may play a role if a lot of TLB misses are involved - even with hardware support (extended/nested page tables), a TLB miss may be way slower in a VM guest.

Josef
|
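A note on how the two numbers in this thread could both be true: Callgrind only models user-level execution, so time spent in I/O and in the OS is invisible to it. The sketch below is a back-of-the-envelope calculation, not a measurement. It assumes that "20% higher" means a 1.2x speedup in real runs, that the 5x figure applies only to the part Callgrind can see, and that the rest of the run time is unchanged by the cache. The function names are made up for illustration.

```python
def visible_fraction(measured_speedup: float, simulated_speedup: float) -> float:
    """Fraction f of the cache-off run time that the simulator can see.

    Assumes the visible part gets faster by `simulated_speedup` and the
    rest (I/O, kernel, waiting) does not change at all, i.e. it solves
        1 / ((1 - f) + f / simulated_speedup) == measured_speedup
    for f.
    """
    if simulated_speedup <= 1.0:
        raise ValueError("simulated_speedup must be greater than 1")
    if not 1.0 <= measured_speedup <= simulated_speedup:
        raise ValueError("measured_speedup must lie in [1, simulated_speedup]")
    return (1.0 - 1.0 / measured_speedup) / (1.0 - 1.0 / simulated_speedup)


def combined_speedup(f: float, simulated_speedup: float) -> float:
    """Whole-run speedup when only the fraction f gets faster."""
    return 1.0 / ((1.0 - f) + f / simulated_speedup)


if __name__ == "__main__":
    # Figures from the thread, under the assumptions stated above.
    f = visible_fraction(measured_speedup=1.2, simulated_speedup=5.0)
    print(f"visible fraction of run time: {f:.3f}")
    print(f"check, combined speedup: {combined_speedup(f, 5.0):.3f}")
```

Under these assumptions, roughly a fifth of the cache-off run time would be user-level work that Callgrind sees, and the remainder would be I/O and OS time. This is consistent with Josef's point that a user-level simulation overstates the gap for an I/O-bound database.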