From: Jeff A. <ja...@fa...> - 2018-03-05 19:41:51
Bugs have come up over the years that relate to multiple Python interpreters and their state with threads (#2465, #2513, #2505, #2507, #2199). In circumstances where there is more than one interpreter and more than one JVM worker thread, properties that users hoped would be independently settable turn out not to be, or get mixed up. Mostly, these are things we keep in the sys module, aka PySystemState. We've claimed victory on this problem a few times (closed a few issues) but, like a badly stretched carpet, nailing it down in one place makes it wrinkle up somewhere else. This makes me suspect that something fundamental may be wrong.

Apparently [1], it is too strong to say (C)Python is broken in this area of threads and interpreters. Yet CPython is certainly in some difficulty making the same carpet lie flat [2].

[1] https://mail.python.org/pipermail/python-ideas/2017-May/045770.html
[2] https://docs.python.org/3/c-api/init.html#bugs-and-caveats (2nd paragraph)

Do our problems stem from copying a flawed model? Is it just more painfully obvious in Jython because of the absence of the GIL, and the type of applications people build on the JVM?

My understanding of the C API is that a PyThreadState points to the tip of the stack of active PyFrames, and so at any moment where that thread is not actually running in a CPU, it holds all the state necessary to resume. Any reference in the resumed code to module-global state, e.g. imported modules, will be resolved to the values prevailing before suspension.

In CPython a PyThreadState is paired with an OS thread, and in Jython with a JVM Thread, in such a way that any executing code that doesn't already have it in a local variable can look up the current PyThreadState, and hence all the stack and interpreter state. "Paired" in Jython means we use a ThreadLocal to get from Thread to state. In CPython it is the same, with the addition of the GIL to police exclusive access to interpreter resources when that thread becomes the unique current thread.

Also, every PyThreadState references one PyInterpreterState (many-one), where the import mechanism to use is referenced, and where the sys module (path, meta_path), builtins and codec registry all sit, so a subsequent import statement will search the right places. Anything implicitly dependent on the interpreter state (such as print, through sys) will behave consistently. Or so we hope.

The CPython implementation allows for multiple PyInterpreterState objects (sub-interpreters), each with its own collection of PyThreadState objects. The C API [2] warns us that GIL manipulation combined with multiple sub-interpreters is delicate because of the assumption that one OS thread maps to at most one PyThreadState.

The assumption is that the OS/JVM thread will lead us to the correct PyThreadState, and so to the correct PyInterpreterState, but certain sequences of operations expose the assumption as unreliable. I argue that this is an incorrect basis for finding the right interpreter (as [2] effectively warns us).

In a compelling case for Jython, a Java thread pool is shared by multiple sub-interpreters, each of which may have made changes to the module search path, builtins, available codecs or sys.std[in|out|err]. The pool has a queue of tasks, each waiting for any available JVM thread. An object representing a task was created in a particular sub-interpreter.
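To make the pairing concrete, here is a minimal sketch in Java (InterpState, ThreadState and PyRuntime are hypothetical stand-ins, not the actual runtime classes) of how a ThreadLocal takes us from the current Thread to a thread state, and through it to one interpreter's state:

    // Minimal sketch only: InterpState, ThreadState and PyRuntime are
    // hypothetical stand-ins, not the actual runtime classes.
    final class InterpState {
        final String name;                  // stands in for sys/PySystemState etc.
        InterpState(String name) { this.name = name; }
    }

    final class ThreadState {
        final InterpState interp;           // many ThreadStates -> one InterpState
        ThreadState(InterpState interp) { this.interp = interp; }
    }

    final class PyRuntime {
        static final InterpState DEFAULT = new InterpState("default");

        // The "pairing": each JVM Thread lazily acquires a ThreadState.
        private static final ThreadLocal<ThreadState> CURRENT =
                ThreadLocal.withInitial(() -> new ThreadState(DEFAULT));

        static ThreadState getThreadState() { return CURRENT.get(); }
        static void setThreadState(ThreadState ts) { CURRENT.set(ts); }
    }

A pooled worker thread keeps whichever ThreadState it last acquired (or falls back to the default), so getThreadState() answers according to the thread's history, not according to the sub-interpreter in which the task it is now running was created.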
If the task involves any code compiled from Python, or "Python-aware" Java, that code can only execute as expected in the context the programmer has created for it in the state of that interpreter. For example, if execution encounters an import statement, the search path should be as defined where the task was prepared. If there is a call to print(), and sys.stdout has been redefined in the interpreter that originated the task, that's the sys.stdout that print() should find.

When a task is given to a worker thread, it doesn't matter that some other sub-interpreter holds a PyThreadState that represented this thread in the past, or that none does. We need a way to map from a task to its defining sub-interpreter, and (probably) to create a PyThreadState in that sub-interpreter for the OS/JVM thread that happens to be running the task, at least before code is executed that references the thread and interpreter. Our only clue to this is the task object itself.

Although this is a compelling case for the JVM, it may not seem so for CPython. However, the same surely applies wherever a C extension loses interpreter context for an object that transfers between threads. In fact, such loss of context happens all the time when interpreting CPython byte code, and the context has to be found each time we create a PyFrame, for example by a call to PyThreadState_GET(); it is just that the thread has hardly ever changed. Even for the JVM, we should not think this applies only to objects explicitly created to represent tasks: any call-back, in fact any invocation of a Python-defined operation (like PyObject._add, when it leads to an __add__ method defined in Python), needs the same consideration if objects may be handled by different threads.

At first, it seems as if every PyObject must have a reference to its owning interpreter. I think this is not the case. Since it is only the execution of code that raises the issue, and a (Python) function or method seems always able to find the module it belongs to, I expect only module objects to need this reference. The places where we must resolve the interpreter are those where we currently pick up the thread or interpreter state by a call to runtime support. Subject to a deeper look, I think the slow path via the module is only needed where presently there is no ThreadState and we fall back on the "default interpreter". But in the absence of certainty, we shouldn't have to guess. A rough sketch of the idea is appended below.

This solution is controversial, I don't doubt.

Jeff

--
Jeff Allen
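As a concrete illustration of the proposal (hypothetical names again, reusing the classes from the earlier sketch): the task captures a reference to its owning interpreter when it is created, and the worker binds a ThreadState for that interpreter before running any Python-aware code, restoring the previous state afterwards.

    // Sketch of the proposal, reusing the hypothetical classes above.
    final class InterpreterTask implements Runnable {
        private final InterpState owner;    // captured where the task was created
        private final Runnable body;        // the Python-aware work to run

        InterpreterTask(InterpState owner, Runnable body) {
            this.owner = owner;
            this.body = body;
        }

        @Override
        public void run() {
            ThreadState previous = PyRuntime.getThreadState();
            // Make the current JVM thread resolve to the owning sub-interpreter,
            // whatever ThreadState this pooled thread held before.
            PyRuntime.setThreadState(new ThreadState(owner));
            try {
                body.run();                 // import, print() etc. see owner's state
            } finally {
                PyRuntime.setThreadState(previous);
            }
        }
    }

Then, for example, pool.submit(new InterpreterTask(interpA, work)) behaves the same whichever pooled thread happens to pick the task up. Whether the owner reference lives on the task object, or is reached through the module of the code being run (as I suggest above), is a detail of the same idea.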