From: Chris K. <col...@gm...> - 2004-08-12 18:22:40
On Thu, 12 Aug 2004 17:41:52 +0100, Armin Rigo <ar...@tu...> wrote:

> The ucontext mechanism (which I didn't know about, thanks) appears not
> to be portable, though, and I don't think you can do it with
> setjmp/longjmp.

With some gcc trickery (__builtin_return_address, __builtin_frame_address, and computed gotos), a setjmp/longjmp implementation should be possible (though messy), but that still restricts it to gcc.

> There is even an off-product of Stackless called Greenlets that is
> essentially a Python-oriented ucontext implementation. (Both Stackless
> and Greenlets use a few lines of assembly code to hack the stack.)

Quite interesting, that! Of course, using assembly restricts the implementation to a specific architecture, but it would make rewriting for different architectures easier.

> the state will be saved off and restored to the C stack around a
> 'yield'. In some sense it is better because it avoids keeping a lot of
> small C stacks around (and Psyco wastes quite a lot of C stack space,
> so multiplying it by the number of active generator instances doesn't
> look like a good idea).

My initial idea was to save only the portion of the stack actually used into a dynamic buffer, rather than using the multiple fixed stacks of ucontext. I decided against this because of the overhead of copying to and from the stack buffer, opting instead for the increased memory usage of ucontext. The bytecode solution seems similar: it would save memory, but waste processor time. It seems, then, that the best compromise would be to keep multiple fixed Python stacks around, à la ucontext, and switch quickly between them, but that will have to wait for Stackless Psyco ;)

Just to throw in an example use case: the program I'm working on right now uses dozens (possibly even hundreds) of generators, stored in a heap, which must all be created and (partially) iterated through in around 1/10 of a second (for UI purposes).
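(For concreteness, the pattern is roughly the following sketch; the names and numbers here are hypothetical, not from the actual program. Generators sit in a heap keyed by priority, and each UI tick partially advances the most urgent one.)

```python
import heapq
import itertools

def worker(n):
    """A hypothetical generator: yields partial results one step at a time."""
    for step in range(n):
        yield (n, step)

# Build a heap of (priority, tiebreak, generator) entries; the tiebreak
# counter keeps heapq from ever comparing two generator objects.
counter = itertools.count()
heap = [(n, next(counter), worker(n)) for n in range(50, 0, -1)]
heapq.heapify(heap)

def tick(heap, steps=3):
    """Partially iterate the highest-priority generator, then re-queue it."""
    priority, tie, gen = heapq.heappop(heap)
    results = list(itertools.islice(gen, steps))
    if len(results) == steps:  # possibly more to come; put it back
        heapq.heappush(heap, (priority, next(counter), gen))
    return results
```

Each `tick(heap)` call resumes one generator's saved frame for a few steps, which is exactly the kind of workload where per-generator stack overhead multiplies.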
It runs fine on faster processors, but below 500 MHz it could use a Psyco boost. I imagine memory usage for a ucontext-based implementation would be incredible in this case; but then, time spent copying state to and from smaller buffers might overshadow the actual time spent inside the generators themselves (which is negligible).

Out of curiosity, which part of Psyco eats up stack space: the code-generating/profiling routines, or the generated code itself? If it's the former, a ucontext-style implementation would likely be best, provided the two stacks could be separated (no doubt a daunting task, though).

Just some thoughts.
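(As an aside, at the pure-Python level generators already behave like the keep-the-stacks-and-switch scheme discussed above: each generator keeps its own frame alive between next() calls, and a switch costs only a call, not a state copy. A toy round-robin scheduler, with hypothetical names, to illustrate:)

```python
from collections import deque

def task(name, steps):
    """Each generator keeps its own live frame; 'yield' is the switch point."""
    for i in range(steps):
        yield f"{name}:{i}"

def run(tasks):
    """Round-robin between saved frames; switching copies no state."""
    queue = deque(tasks)
    trace = []
    while queue:
        gen = queue.popleft()
        try:
            trace.append(next(gen))   # resume this frame where it left off
        except StopIteration:
            continue                  # frame finished; drop it
        queue.append(gen)             # re-queue for the next switch
    return trace

# run([task("a", 2), task("b", 1)]) → ['a:0', 'b:0', 'a:1']
```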