Re: [pyOpenSSL] Python core dumps when using pyOpenSSL 0.8.0 and threads

SourceForge Headquarters 225 Broadway Suite 1600 San Diego, CA 92101 +1 (858) 422-6466

On Wed, 25 Mar 2009 00:24:29 +0100, "M.-A. Lemburg" <ma...@eg...> wrote:
>On 2009-03-22 17:32, Jean-Paul Calderone wrote:
>> Hello all,
>>
>> I just pushed a branch to launchpad which may address the 0.8 thread-related
>> crashes.
>>
>> The bug, along with a reproduction script, is described here:
>>
>>   https://bugs.launchpad.net/pyopenssl/+bug/344815
>>
>> The branch, along with checkout instructions, which fixes the problem is here:
>>
>>   https://code.launchpad.net/~exarkun/pyopenssl/thread-crash
>>
>> If you're not in to bzr, then you can get just the revision which includes
>> the fix here:
>>
>>   http://bazaar.launchpad.net/~exarkun/pyopenssl/thread-crash/revision/98
>

Hi Marc,

Thanks for looking into this.  Sorry for the delayed response, PyCon has
been keeping me busy.

>This doesn't appear to cover all cases, esp. not if you have an
>application that uses recursion in callbacks.
>
>The patch causes an already existing thread state to get overwritten
>by another one. If the application starts returning from a deeper
>nesting, this will cause a wrong thread state to get restored (actually,
>the last one will get restored multiple times).

I think this will work out okay.  The thread state is actually a TLS object
anyway.  Any given thread can only have one thread state object.  So
PyEval_SaveThread will always return the same PyInterpreterState*.  At
least, I think this is how it happens.

Going into a bit more detail...

Say someone calls Connection.send.  MY_BEGIN_ALLOW_THREADS is invoked and
PyEval_SaveThread swaps NULL in as the new PyInterpreterState and gives
back the existing PyInterpreterState (call it S1).  This gets saved with
pyOpenSSL's _pyOpenSSL_tstate_key.  Any key there already is deleted.

  Now say one of the pyOpenSSL global callbacks gets invoked.  It starts by
  using MY_END_ALLOW_THREADS.  This loads S1 from _pyOpenSSL_tstate_key and
  passes it to PyEval_RestoreThread which swaps S1 in as the new
  PyInterpreterState.  Then it calls into some arbitrary Python code.

    Now say this arbitrary Python code calls another Connection's send
    method.  MY_BEGIN_ALLOW_THREADS is invoked again.  This deletes the
    current value of _pyOpenSSL_tstate_key and invokes PyEval_SaveThread
    again which swaps in NULL as the new PyInterpreterState and then saves
    the existing PyInterpreterState in _pyOpenSSL_tstate_key.  Since the
    previous step saved S1 as the PyInterpreterState, this is S1 again.

    Eventually the send finishes, MY_END_ALLOW_THREADS is invoked.  This
    the value of _pyOpenSSL_tstate_key which is S1, as described in the
    previous step.  S1 is passed to PyEval_RestoreThread and then the send
    call is done.

  Now control returns to the global callback code which finishes up and
  gets ready to return back to OpenSSL.  It does this by invoking
  MY_BEGIN_ALLOW_THREADS which clears _pyOpenSSL_tstate_key and then saves
  the result of PyEval_SaveThread (S1) in it.  Then control returns to
  OpenSSL.

Now the very top Connection.send finishes and the implementation of that
method in pyOpenSSL gets ready to return to the calling Python code.  It
invokes MY_END_ALLOW_THREADS which loads S1 from _pyOpenSSL_tstate_key and
passes it to PyEval_RestoreThread.  Then it returns to the calling Python
code.

I wasn't actually very confident that the patch was correct, but now that
I've actually written all that up and traced through the various levels of
thread control APIs in CPython, I'm relatively confident that the patch is
doing the right thing.  I also learned that there are some APIs in CPython
which do much of what pyOpenSSL is doing.  They aren't available in Python
2.3 though, so I'm not quite ready to switch over to them.

How does all this sound?

Jean-Paul