From: Julian S. <js...@ac...> - 2012-06-17 11:21:10
On Friday, June 15, 2012, Eliot Moss wrote:
I think Philippe already commented on this, but to add my 2 euro-cents ..
> An approach I was thinking you might consider is:
>
> - When a thread needs to JIT code, it JITs into memory space that
> it "owns" for the purpose, i.e., there are per-thread JIT buffers,
> possibly acquired (from time to time, in large enough chunks to
> mitigate overhead of acquiring them) from a global pool. Such
> acquisition would have to be synchronized (i.e., use a lock or
> something) but would be infrequent, with most work happening
> into local buffers.
>
> - When a thread is done JITing a chunk, it installs it into a global
> lookup table atomically. It might be possible for two threads to
> try to JIT the same chunk at the same time. They might each generate
> code then both try to install, with one "winning" and one "losing".
> The "loser" would simply discard its translation and use the other
> one.
>
> - Obviously this sometimes wastes work. If such waste is common, you
> could add an atomic operation that notes "thread xyz is working on
> translating this". Other threads would wait until xyz finishes.
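As a rough illustration of the atomic install step quoted above, here is a minimal C11 sketch. It is not Valgrind code: the names `Translation`, `table`, `TABLE_SIZE`, `slot_for`, `install_translation` and `lookup_translation` are all hypothetical, and it assumes an insert-only table (no deletion, which is exactly the complication discussed in the reply below).

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 1024  /* hypothetical size; must be a power of two */

typedef struct {
    uintptr_t guest_addr;  /* guest address this translation covers */
    void     *host_code;   /* JITted code in the owning thread's buffer */
} Translation;

/* Global lookup table. NULL means "slot empty". Static storage is
   zero-initialised, so every slot starts out empty. */
static _Atomic(Translation *) table[TABLE_SIZE];

static size_t slot_for(uintptr_t guest_addr) {
    return (guest_addr >> 2) & (TABLE_SIZE - 1);
}

/* Try to publish `mine`. Returns the translation that is installed for
   mine->guest_addr: `mine` itself if this thread won, the other thread's
   translation if it lost (the caller then discards `mine`), or NULL if
   the table is full. Uses linear probing to step over collisions. */
Translation *install_translation(Translation *mine) {
    size_t start = slot_for(mine->guest_addr);
    for (size_t n = 0; n < TABLE_SIZE; n++) {
        size_t i = (start + n) & (TABLE_SIZE - 1);
        Translation *seen = atomic_load(&table[i]);
        if (seen == NULL) {
            if (atomic_compare_exchange_strong(&table[i], &seen, mine))
                return mine;  /* we won the race for this slot */
            /* CAS failed: `seen` now holds what the other thread installed */
        }
        if (seen->guest_addr == mine->guest_addr)
            return seen;      /* we lost: use the winner's translation */
        /* slot belongs to a different guest address: keep probing */
    }
    return NULL;
}

/* Find the installed translation for guest_addr, or NULL if none. */
Translation *lookup_translation(uintptr_t guest_addr) {
    size_t start = slot_for(guest_addr);
    for (size_t n = 0; n < TABLE_SIZE; n++) {
        size_t i = (start + n) & (TABLE_SIZE - 1);
        Translation *seen = atomic_load(&table[i]);
        if (seen == NULL)
            return NULL;      /* an empty slot ends the probe sequence */
        if (seen->guest_addr == guest_addr)
            return seen;
    }
    return NULL;
}
```

A caller would JIT into its own buffer, call `install_translation`, and throw its copy away whenever the returned pointer differs from the one it passed in.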
This is a plausible scheme, as far as it goes. The drawback is that the
problem is more complex than merely JITting code, in 2 ways:
* at some arbitrary point after JITting (but often very soon), the new
code has to be patched ("translation chained") so it jumps directly to
downstream translations. So we want to avoid races during patching,
or perhaps use some glorified CAS scheme, since it's an atomic replacement
of a handful of insn bytes.
* even later on, code will need to be discarded. This happens when libraries
are unmapped, and in other situations. So the mapping tables then need to
be updated. Also, this is much more complex in the presence of chaining,
since we first need to un-chain any translations that jump to (iow, have
been chained to) the translation(s) that are about to be deleted.
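To make the "glorified CAS" idea concrete, here is a deliberately simplified C11 model of chaining and un-chaining. It is an assumption-laden sketch, not how Valgrind patches code: it pretends a patchable jump site is a single aligned 64-bit word holding either a hypothetical `UNCHAINED` marker or a target address, and the names `JumpSite`, `chain` and `unchain` are invented for illustration. Patching real instruction bytes would additionally have to deal with alignment and, on some architectures, instruction-cache flushing.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical marker meaning "not chained; go back via the dispatcher". */
#define UNCHAINED ((uint64_t)0)

/* Model of one patchable jump site at the end of a translation. */
typedef struct {
    _Atomic uint64_t slot;  /* UNCHAINED, or address of the chained target */
} JumpSite;

/* Chain the site to `target`. Succeeds only if the site is still
   unchained, so two threads racing to patch it cannot both win. */
bool chain(JumpSite *site, uint64_t target) {
    uint64_t expected = UNCHAINED;
    return atomic_compare_exchange_strong(&site->slot, &expected, target);
}

/* Un-chain the site, but only if it currently jumps to `doomed_target`,
   i.e. the translation that is about to be discarded. A site that has
   been chained elsewhere (or not at all) is left untouched. */
bool unchain(JumpSite *site, uint64_t doomed_target) {
    uint64_t expected = doomed_target;
    return atomic_compare_exchange_strong(&site->slot, &expected, UNCHAINED);
}
```

Before discarding a translation, one would call `unchain` on every site that might jump to it; finding those sites is the bookkeeping that makes deletion harder once chaining is in play.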
> I've not dug around in the code much, but it wouldn't surprise me if
> significant data structures would have to become per-thread instead
> of global/static. This could be painful to "fix", requiring more
> arguments to calls, or lookups for per-thread data, etc.
Yeah, I also wouldn't be surprised to find that.
J