Hello there,
My name is Kay Hayen. I am the author of the Python compiler Nuitka
(http://nuitka.net), and so far I have very successfully used Valgrind
to analyse the performance of the binaries it creates in comparison to
the original CPython bytecode interpretation.
However, I am now set on creating a comparison tool. The user should
be able to run his code with both Nuitka and CPython and compare the
performance. I sort of promised this to my users during EuroPython
this week.
For CPython, and partially for Nuitka, where bytecode gets evaluated,
effectively always in a Python frame (function or module), this is
basically just PyEval_CallFrame (or similar) on the C call stack, and
as such totally impossible to associate with the Python code without
looking at its arguments.
(For Nuitka, there may also be inlining, where no call is made and
only the debugging information changes. But let's ignore that; I
probably need to solve that when presenting the data.)
I am therefore considering an improvement to callgrind that would make
it emit the Python function name and the Python source file instead.
Do you think that is possible, and can you give me pointers into the
valgrind source distribution as to what to change and where to hook?
Looking at callstack.c in the callgrind directory: would it be possible
to access the running binary directly and capture things from its
data, e.g. from function arguments, or is that not possible?
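To make concrete what I would want callgrind to pick up: every Python
frame object refers to a code object that holds the function name and
the source file. Below is a minimal sketch in Python, using CPython's
sys._getframe. The helper names frame_identity and python_call_stack
are made up for illustration only. A callgrind hook would have to read
the equivalent fields out of the frame argument in the memory of the
profiled process, which this sketch does not attempt.

```python
import sys


def frame_identity(frame):
    """Return (function name, source file, first line) for a Python frame.

    These are the values a profiler would want to show instead of the
    C-level name of the frame evaluation function.
    """
    code = frame.f_code
    return code.co_name, code.co_filename, code.co_firstlineno


def python_call_stack():
    """Collect the Python-level call stack, outermost frame first.

    The frame of this helper itself is skipped (hypothetical helper,
    for illustration only).
    """
    frame = sys._getframe(1)  # CPython-specific: frame of our caller
    stack = []
    while frame is not None:
        stack.append(frame_identity(frame))
        frame = frame.f_back
    stack.reverse()
    return stack


def inner():
    return python_call_stack()


def outer():
    return inner()


if __name__ == "__main__":
    for name, filename, line in outer():
        print(f"{filename}:{line} {name}")
```

Run as a script, this prints one line per active Python frame, ending
with the entries for outer and inner. All of these frames would show
up in callgrind under the same C function today.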
Should this be possible, it would enable profiling of Python code with
valgrind that is accurate even for single iterations, as it is based
on ticks. So please help me there if you can.
Yours,
Kay Hayen