From: Christian T. <ti...@st...> - 2004-08-13 00:03:37
Chris King wrote:
> On Thu, 12 Aug 2004 17:41:52 +0100, Armin Rigo <ar...@tu...> wrote:
> ...
>> the state will be saved off and restored to the C stack around a
>> 'yield'. In some sense it is better because it avoids keeping a lot
>> of small C stacks around (and Psyco wastes quite a lot of C stack
>> space, so multiplying it by the number of active generator
>> instances doesn't look like a good idea).
>
> My initial idea was to save only what stack was used into a dynamic
> buffer, rather than using the multiple fixed stacks of ucontext. I
> decided against this because of the overhead of copying to and from a
> stack buffer, opting instead for the increased memory usage of
> ucontext. The bytecode solution seems similar to this -- it would
> save memory but waste processor time. It seems, then, that the best
> compromise would be to keep multiple fixed Python stacks around, à la
> ucontext, and switch quickly between them, but that will have to wait
> for Stackless Psyco ;)

Yes and no: we are after both solutions, since each has advantages that are simply different and orthogonal.

The stack-switching approach has the advantage of a speed not reachable by anything else -- on certain platforms! There are Stackless-supported platforms where no stack switch can *ever* be fast. Ever used a SPARC? You need to flush all the register windows and restore them, so this architecture simply cannot switch contexts efficiently. On x86 and the like you are right: stack switching is *very* fast. This speed comes at a cost as well: without huge effort, pickling the program state is impossible. And if you need very many generators, this approach doesn't lead anywhere, since the cost of a stack is many kilobytes, not a few hundred bytes. By "very many" I mean 100,000 and more. If you need only hundreds, this is of course trivial.

The bytecodehacks approach is more in the direction of Stackless 3.0: if we can avoid using the stack, we do.
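To make the "avoid using the stack" idea concrete, here is a hypothetical sketch: a resumable computation whose entire suspended state is reduced to a small Python tuple instead of a live C stack, which makes it cheap to keep in bulk and trivial to pickle. All names are illustrative; this is not Stackless or Psyco code.

```python
# Hypothetical sketch of the state-tuple idea: the whole "frame" of a
# suspended countdown is one small tuple, so it can be pickled and
# resumed later, with no C stack kept alive per instance.
import pickle

class CountdownState:
    """Resumable countdown; its complete state is the tuple (n,)."""
    def __init__(self, n):
        self.state = (n,)          # the entire saved frame

    def resume(self):
        (n,) = self.state          # restore state
        if n == 0:
            raise StopIteration
        self.state = (n - 1,)      # save the new state tuple
        return n

gen = CountdownState(3)
print(gen.resume())                # prints 3
blob = pickle.dumps(gen)           # picklable: no C stack involved
gen2 = pickle.loads(blob)
print(gen2.resume())               # prints 2, resumed from pickled state
```

Compared with a per-generator C stack of many kilobytes, each of these suspended states costs only a tuple, which is what makes hundreds of thousands of instances thinkable.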
This costs a little computation time, though not much. We always leave the stack clean, which gives us minimal memory cost, paid for with a little computation time, of course. But this will be optimal on SPARC-like architectures, for instance. It also gives us trivial thread pickling, since the state tuples pickle easily.

Another *huge* advantage of the bytecode approach: here we can make full use of inlining techniques, which I am about to publish. I have enhanced the bytecodehacks to do real, completely compatible inlining of Python functions in all situations. The advantage of this is: if you can inline a generator completely, the whole function call melts away, together with the state store/restore sequence. Psyco optimizes this away like a charm! This is something that no stack trick can match. Used in the right measure, this feature is unbeatable.

Finally, I want to combine both techniques: inlining up to a bearable limit of code bloat. This creates huge, fast code blocks with very many local variables. Switching these with the state-tuple approach would be expensive, because the state tuples get huge. Exactly here it makes sense to switch to the stack-switching model. This combination seems to be close to the best of what can be done at all, and I think we are very near a system with no competitors that I know of.

cheers -- chris

--
Christian Tismer             :^)   <mailto:ti...@st...>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  mobile +49 173 24 18 776
PGP 0x57F3BF04   9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
     whom do you want to sponsor today?   http://www.stackless.com/
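[A hypothetical illustration of the inlining point made in the message above: when a generator is inlined completely into its consumer, the per-yield suspend/resume of a generator frame disappears, leaving a flat loop that a specializer such as Psyco can handle directly. The function names here are made up for illustration and are not the bytecodehacks API.]

```python
# Made-up example showing why inlining a generator removes the state
# store/restore sequence: the two functions compute the same result,
# but the second has no generator frame to suspend and resume.

def squares(n):                    # original generator
    for i in range(n):
        yield i * i

def total_with_generator(n):
    s = 0
    for v in squares(n):           # each step suspends/resumes a frame
        s += v
    return s

def total_inlined(n):
    s = 0
    for i in range(n):             # generator body inlined: no frame,
        s += i * i                 # no state save/restore per yield
    return s

assert total_with_generator(10) == total_inlined(10) == 285
```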