From: Kjetil J. <kje...@cs...> - 2001-12-14 13:07:06
hi!

psyco allows you to explicitly generate code for an existing python function through the use of proxy functions. however, not every application will benefit from dynamic code generation. for instance, in applications with few function calls, the overhead of dynamically generating code may outweigh the benefits, as the cost of generating code is not zero. hence, we would like psyco to dynamically generate code only for functions that are executed frequently.

one way of achieving this in vanilla python is to use the setprofile function in the sys module. setprofile is given a reference to a function f. f is subsequently invoked at every function call, return and exception. the function f may include logic to count the number of invocations of a particular function, rebind functions to a psyco proxy if applicable, and so on.

however, setprofile is no free lunch. so, i've done some simple tests to get a picture of the performance overhead. i have used the pystone benchmark, as it performs lots of function calls and is thus fairly useful for determining the overhead of setprofile.

the first test uses a "void"-like profile function:

    def f(frame, event, arg):
        pass

the measured overhead of using sys.setprofile _with_ and _without_ a hook to f (numbers are pystones/second, so higher is better):

    pystone.py with sys.setprofile(f):       2178
    pystone.py without sys.setprofile(f):    5582

here, pystone executes roughly 2.6 times faster when _not_ using the profiler function f.

now, once f does something meaningful, like tracking the number of invocations of each function, the performance is expected to decrease further. to get a better picture of such a scenario, i replaced function f with function g:

    funcs = {}

    def g(frame, event, arg):
        # only count function entry; ignore returns and exceptions
        if event != 'call':
            return
        fn = frame.f_code.co_name
        if fn in funcs:
            funcs[fn] += 1
        else:
            funcs[fn] = 1

it's a fairly simple function that logs the number of invocations per function in a dictionary 'funcs'.
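for anyone who wants to try the counting hook on a current python, here is a minimal, self-contained sketch. it is not the code that produced the timings in this mail: the names count_calls, run_profiled, work and helper are made up for illustration, and a Counter is used in place of the plain dictionary.

```python
import sys
from collections import Counter

# Number of Python-level calls seen per function name (illustrative only).
call_counts = Counter()

def count_calls(frame, event, arg):
    # Count only Python function entries; returns and C-function
    # events ('c_call', 'c_return', ...) are ignored.
    if event == 'call':
        call_counts[frame.f_code.co_name] += 1

def helper(i):
    return i * 2

def work(n):
    total = 0
    for i in range(n):
        total += helper(i)
    return total

def run_profiled(func, *args):
    # Install the hook only around the call, and always remove it again.
    call_counts.clear()
    sys.setprofile(count_calls)
    try:
        return func(*args)
    finally:
        sys.setprofile(None)
```

calling run_profiled(work, 5) runs work with the hook installed and leaves the per-function counts in call_counts. note that the counts are keyed on the bare function name, so two functions with the same name in different modules would share one counter.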
with function g, the timings turned into:

    pystone.py with sys.setprofile(g):       1457

pystone with function g is now 3.8 times slower than pystone without g. in addition, a real jit would need logic for deciding when to create a psyco proxy for a function, and would have to do the actual rebinding from the old function to the proxy function. needless to say, psyco must do a pretty good job optimizing functions to justify the overhead of using sys.setprofile.

so the next logical step is to add some more logic for rebinding functions dynamically, to see whether rebinding within a setprofile function pays off. to do so, i've made a simple module called psyco.py and a special version of the pystone benchmark called pystone-jit.py that utilizes the psyco module. both files can be found in the test/ directory in psyco-cvs.

    pystone-jit with psyco module (first):   1790
    pystone-jit with psyco module (second):  6410

(pystone-jit executes the benchmark twice to take advantage of function rebinding.)

the first iteration is pretty slow, only about 23% faster than pystone.py with function g and 3.1 times slower than without any setprofile function. this is because not all functions have been rebound yet during the first run. the next iteration is pretty fast, actually about 15% faster than running pystone without setprofile functions (6410 vs. 5582).

however, note that the psyco module is still very simple and only rebinds functions it finds in the global namespace. also, if functions are called only once (where the rebinding will have no effect), the overhead of setprofile is significant and the application will run slower with psyco than without psyco.

so it seems that we should investigate alternatives to the setprofile function. in particular, we might not need the profile function to be invoked for _every_ function call.
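to make the rebinding idea concrete, here is a rough sketch of a profile function that counts calls and, once a function looks "hot", swaps the global name for a proxy. this is not the psyco.py module from the test/ directory. THRESHOLD, make_proxy, jit_profile, square and run are hypothetical names, and make_proxy is only a stand-in wrapper that generates no code at all.

```python
import sys

THRESHOLD = 3      # hypothetical: number of calls before a function is "hot"
call_counts = {}   # function name -> calls seen so far
rebound = set()    # names that have already been rebound to a proxy

def make_proxy(func):
    # Stand-in for a real code-generating proxy: it just forwards the call.
    def proxy(*args, **kwargs):
        return func(*args, **kwargs)
    proxy.is_proxy = True
    return proxy

def jit_profile(frame, event, arg):
    if event != 'call':
        return
    code = frame.f_code
    name = code.co_name
    if name in rebound:
        return
    count = call_counts.get(name, 0) + 1
    call_counts[name] = count
    if count >= THRESHOLD:
        # Rebind only plain functions found in the global namespace,
        # and only if the global really is the function being called.
        target = frame.f_globals.get(name)
        if getattr(target, '__code__', None) is code:
            frame.f_globals[name] = make_proxy(target)
            rebound.add(name)

def square(x):
    return x * x

def run(n):
    results = []
    sys.setprofile(jit_profile)
    try:
        for i in range(n):
            results.append(square(i))   # global lookup on every iteration
    finally:
        sys.setprofile(None)
    return results
```

with this sketch, the first THRESHOLD calls to square go to the original function, and later calls go through the proxy because the loop looks the name up in the globals each time. callers that hold their own reference to the original function (a local variable, a bound method, an import into another module) are not affected, which is the same limitation as rebinding only in the global namespace.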
maybe invocation at every 10th or 100th call to the same function is sufficient to determine whether a function should be rebound to a proxy function. i'd like to hack a bit on ceval.c so that the setprofile function is only invoked at particular intervals instead of at every function call.

so far, i've used python code for the setprofile functions. i'll also investigate whether using a c function instead will boost performance significantly.

i guess we should also have a look at the java jits and the hotspot stuff...

regards,
- kjetil