From: Christian T. <ti...@st...> - 2004-08-13 00:03:37
Chris King wrote:
> On Thu, 12 Aug 2004 17:41:52 +0100, Armin Rigo <ar...@tu...> wrote:
> ...
>> the state will be saved off and restored to the C stack around a
>> 'yield'. In some sense it is better because it avoids keeping a lot
>> of small C stacks around (and Psyco wastes quite a lot of C stack
>> space, so multiplying it by the number of active generator
>> instances doesn't look like a good idea).
>
> My initial idea was to save only what stack was used into a dynamic
> buffer, rather than using the multiple fixed stacks of ucontext. I
> decided against this because of the overhead of copying to and from a
> stack buffer, opting instead for the increased memory usage of
> ucontext. The bytecode solution seems similar to this -- it would
> save memory but waste processor time. It seems, then, that the best
> compromise would be to keep multiple fixed Python stacks around, à la
> ucontext, and switch quickly between them, but that will have to wait
> for Stackless Psyco ;)

Yes and no: we are after both solutions, since each has advantages that are simply different and orthogonal.

The stack-switching approach has the advantage of a speed not reachable by anything else -- on certain platforms! There are Stackless-supported platforms where no stack switch can *ever* be fast. Ever used a SPARC? You need to flush all the register windows and restore them, so this architecture simply cannot switch contexts efficiently. On x86 and the like you are right: stack switching is *very* fast. This speed comes at a cost as well: without huge effort, pickling the program state is impossible. And if you need very many generators, this approach doesn't lead anywhere, since the cost of a stack is many kilobytes, not a few hundred bytes. By "very many" I mean 100,000 and more. If you need only hundreds, this is of course trivial.

The bytecodehacks approach is more in the direction of Stackless 3.0: if we can avoid using the stack, we do.
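To make the "avoid using the stack" idea concrete, here is a hypothetical sketch: a resumable computation whose entire suspended state is reduced to a small Python tuple instead of a live C stack, which makes it cheap to keep in bulk and trivial to pickle. All names are illustrative; this is not Stackless or Psyco code.

```python
# Hypothetical sketch of the state-tuple idea: the whole "frame" of a
# suspended countdown is one small tuple, so it can be pickled and
# resumed later, with no C stack kept alive per instance.
import pickle

class CountdownState:
    """Resumable countdown; its complete state is the tuple (n,)."""
    def __init__(self, n):
        self.state = (n,)          # the entire saved frame

    def resume(self):
        (n,) = self.state          # restore state
        if n == 0:
            raise StopIteration
        self.state = (n - 1,)      # save the new state tuple
        return n

gen = CountdownState(3)
print(gen.resume())                # prints 3
blob = pickle.dumps(gen)           # picklable: no C stack involved
gen2 = pickle.loads(blob)
print(gen2.resume())               # prints 2, resumed from pickled state
```

Compared with a per-generator C stack of many kilobytes, each of these suspended states costs only a tuple, which is what makes hundreds of thousands of instances thinkable.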
This costs a little computation time, though not much. We always leave the stack clean, which gives us minimal memory cost, paid for with a little computation time, of course. But this will be optimal on SPARC-like architectures, for instance. It also gives us trivial thread pickling, since the state tuples pickle easily.

Another *huge* advantage of the bytecode approach: here we can make full use of inlining techniques, which I am about to publish. I have enhanced the bytecodehacks to do real, completely compatible inlining of Python functions in all situations. The advantage of this is: if you can inline a generator completely, the whole function call melts away, together with the state store/restore sequence. Psyco optimizes this away like a charm! This is something that no stack trick can match. Used in the right measure, this feature is unbeatable.

Finally, I want to combine both techniques: inlining up to a bearable limit of code bloat. This creates huge, fast code blocks with very many local variables. Switching these with the state-tuple approach would be expensive, because the state tuples get huge. Exactly here it makes sense to switch to the stack-switching model. This combination seems to be close to the best of what can be done at all, and I think we are very near a system with no competitors that I know of.

cheers -- chris

--
Christian Tismer             :^)   <mailto:ti...@st...>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  mobile +49 173 24 18 776
PGP 0x57F3BF04   9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
     whom do you want to sponsor today?   http://www.stackless.com/
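[A hypothetical illustration of the inlining point made in the message above: when a generator is inlined completely into its consumer, the per-yield suspend/resume of a generator frame disappears, leaving a flat loop that a specializer such as Psyco can handle directly. The function names here are made up for illustration and are not the bytecodehacks API.]

```python
# Made-up example showing why inlining a generator removes the state
# store/restore sequence: the two functions compute the same result,
# but the second has no generator frame to suspend and resume.

def squares(n):                    # original generator
    for i in range(n):
        yield i * i

def total_with_generator(n):
    s = 0
    for v in squares(n):           # each step suspends/resumes a frame
        s += v
    return s

def total_inlined(n):
    s = 0
    for i in range(n):             # generator body inlined: no frame,
        s += i * i                 # no state save/restore per yield
    return s

assert total_with_generator(10) == total_inlined(10) == 285
```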