#303 Refcounting / GIL problems (dual)

closed-fixed
nobody
pythonwin (176)
9
2006-11-20
2006-11-04
kxroberto
No

I wonder if the below patch possibly addresses a
refcount/memory problem. Think its possible, that
DODECREF(this) frees <*this> from memory, another
released threads malloc uses the memory and
this->virtualInst is bogus?

( In preceding function ui_assoc_object::DoKillAssoc
there is a simlar technique to do "PyObject *vi =
virtualInst;" - but for reverse reason )

( I'm still in search of memory leaks/crashes on dual
cores and fast CPUs when other threads are going on
parallel to win32ui GUI mainthread. )

--- win32assoc.cpp.orig 2005-04-12 17:01:00.000000000 +0200
+++ win32assoc.cpp 2006-11-04 10:59:46.326646400 +0100
@@ -252,9 +252,10 @@
{
if (this==NULL) return NULL;
if (virtualInst) {
+ PyObject *vi = virtualInst;
+ DOINCREF(vi);
DODECREF(this);
- DOINCREF(virtualInst);
- return virtualInst;
+ return vi;
} else
return this;
}

Discussion

1 2 > >> (Page 1 of 2)
  • kxroberto
    kxroberto
    2006-11-04

    Logged In: YES
    user_id=972995

    (repeated)

    --- win32assoc.cpp.orig 2005-04-12 17:01:00.000000000 +0200
    +++ win32assoc.cpp 2006-11-04 10:59:46.326646400 +0100
    @@ -252,9 +252,10 @@
    {
    if (this==NULL) return NULL;
    if (virtualInst) {
    + PyObject *vi = virtualInst;
    + DOINCREF(vi);
    DODECREF(this);
    - DOINCREF(virtualInst);
    - return virtualInst;
    + return vi;
    } else
    return this;
    }

     
  • kxroberto
    kxroberto
    2006-11-04

    • priority: 5 --> 9
     
  • kxroberto
    kxroberto
    2006-11-04

    Logged In: YES
    user_id=972995

    Think I found other serious potential refcount/GIL problems
    which cause crashes here in apps with threads - mainly on
    dual core - and set priority to 9:

    I looked over GUI_BGN_SAVE's
    => Py_XINC/DECREF's outside of the GIL. this causes probably
    the special problems only on dual cores when multiple threads:

    win32cmdui/Line120:
    GUI_BGN_SAVE;
    Python_delete_assoc(ob);
    GUI_END_SAVE;
    win32uimodule/Line650:
    void Python_delete_assoc( void *ob )
    {
    // Notify Python object of my attached object removal.
    {
    CVirtualHelper helper ("OnAttachedObjectDeath", ob);
    <================
    helper.call();
    <================
    }
    ui_assoc_object *pObj;
    if ((pObj=ui_assoc_object::GetPyObject(ob)) &&
    !bInFatalShutdown) {
    CEnterLeavePython _celp; // KillAssoc requires it is held!
    pObj->KillAssoc();
    }
    }

    win32virt.cpp/Line22
    CVirtualHelper::CVirtualHelper(const char *iname, const void
    *iassoc, EnumVirtualErrorHandling veh/* = VEH_PRINT_ERROR */)
    {
    handler=NULL;
    py_ob = NULL;
    ...
    py_ob = py_bob;
    Py_INCREF(py_ob); <================
    // Py_XINCREF(handler);
    }

    The gross hack for speed in
    CVirtualHelper::~CVirtualHelper() will not work for dual cores.
    A fast "ZDODECREF/ZDODECREF" independent of GIL (unless it
    went to 0) would be possible on x86 with "__asm LOCK DEC
    <(op)->ob_refcnt>" or so.
    see:
    http://groups.google.de/group/comp.lang.python/msg/ef530d4448814391

    CVirtualHelper::~CVirtualHelper()
    {
    // XXX - Gross hack for speed. This is called for eachh
    window message
    // so only grabs the Python lock if the objects need Python,
    if ((retVal && retVal->ob_refcnt==1) ||
    <================ DECREF not atomic on multi-core CPU's
    (handler && handler->ob_refcnt==1) ||
    (py_ob && py_ob->ob_refcnt==1)) {
    CEnterLeavePython _celp;
    XDODECREF(retVal);
    XDODECREF(handler);
    XDODECREF(py_ob);
    } else {
    XDODECREF(retVal); <================ DECREF not
    atomic on multi-core CPU's
    XDODECREF(handler);
    XDODECREF(py_ob);
    }
    }

    Maybe there are other loose INCREFS / DECREFS in win32 code?

    And I found - probably most serious in this list - :
    Only CVirtualHelper::do_call is CEnterLeavePython protected:

    BOOL CVirtualHelper::do_call(PyObject *args)
    {
    CEnterLeavePython _celp;
    XDODECREF(retVal); // our old one.
    retVal = NULL;

    but all entrace call functions are not - but they are
    bulding up Python Values !
    Think that is serious (even on single cores where I have
    very rare crashes):
    e.g.:

    BOOL CVirtualHelper::call()
    {
    if (!handler) return FALSE;
    PyObject *arglst = Py_BuildValue("()"); <================
    return do_call(arglst);
    }

    =>Probably the CEnterleave protection should be moved out of
    do_call to all entrance call(...)'s & call_args() ?

    Looked in a rush over the 200 base DECREFs in Pythonwin.
    Found so far no other smelling refcounting.
    Only maybe this is invalid :

    --- win32app.cpp.orig 2006-01-10 15:38:00.000000000 +0100
    +++ win32app.cpp 2006-11-04 16:44:24.009694400 +0100
    @@ -412,6 +412,7 @@
    {
    PyCWinThread::cleanup();
    // total hack!
    - while (pExistingAppObject)
    + while (pExistingAppObject->ob_refcnt>1)
    DODECREF(pExistingAppObject); // this may
    delete it.
    + DODECREF(pExistingAppObject);
    }

     
  • kxroberto
    kxroberto
    2006-11-05

    Logged In: YES
    user_id=972995

    In attachment a patch candidate I've so far in action for
    all these things.
    (That issue around "pExistingAppObject" was nonsense.
    pExistingAppObject seems to be not set to anything at all)

    Notes:

    The speed DECREF's in ~CVirtualHelper are switched all off
    by this patch - as it seems to not be valid so outside of
    the GIL. Would require a __asm LOCK INC/DEC or so. Have not
    seen serious speed issues in apps so far.
    Yet I've seen: most often CVirtualHelp is instanciated -
    costing a enter/leave python state switch - and then
    immediately a helper.call(..) with again a enter/leave state
    switch. If optimization necessary, possibly by a
    CVirtualHelpE does it which enters Python in constructor
    until destruction and the .call's do not again have to
    enter. Only few locations, where the intermediate
    PythonLeave/GUI_BGN_SAVE is necessary.

     
  • kxroberto
    kxroberto
    2006-11-05

     
    Attachments
  • kxroberto
    kxroberto
    2006-11-08

    Logged In: YES
    user_id=972995

    Meanwhile there is significant feedback that
    c_GIL_bugs.patch in fact solved all the crash problems with
    threaded apps on all (dual core) machines.

     
  • kxroberto
    kxroberto
    2006-11-08

    • summary: Refcounting in ui_assoc_object::GetGoodRet --> Refcounting / GIL problems (dual)
     
  • Mark Hammond
    Mark Hammond
    2006-11-20

    Logged In: YES
    user_id=14198
    Originator: NO

    (Very slightly modified version) checked in - thanks!

    Checking in win32assoc.cpp;
    new revision: 1.8; previous revision: 1.7
    Checking in win32virt.cpp;
    new revision: 1.4; previous revision: 1.3

     
  • Mark Hammond
    Mark Hammond
    2006-11-20

    • status: open --> closed-fixed
     
  • kxroberto
    kxroberto
    2011-11-21

    I still have nasty problems with Python objects magically changing / loosing attributes shortly after creation, while using win32ui and heavy threading. Mainly on machines with many Cores/CPUs.

    Patch below: I found 2 strongly smelling locations in win32ui, which are probably responsible: where objects are DECREF'ed and possibly INCREF'ed soon after - while "hoping" that they do not disappear. But I guess this is not valid (today). A new Python object somewhere may use this memory meanwhile and then the object may be stolen back and going hybrid ...

    (Unfortunately I don't have the MS full version compilers to compile current MFC based stuff. (Or is this possible somehow with VC Express versions?) I could make quick tests of win32ui.pyd (py2.3 and py2.6)

    diff -ur --strip _orig/win32assoc.cpp ./win32assoc.cpp
    --- _orig/win32assoc.cpp 2009-03-04 11:52:00 +0000
    +++ ./win32assoc.cpp 2011-11-21 13:10:42 +0000
    @@ -228,11 +228,11 @@
    // So set the instance to NULL _before_ we decref it!
    PyObject *old = pAssoc->virtualInst;
    pAssoc->virtualInst = NULL;
    - XDODECREF(old);
    if (ob!=Py_None) {
    pAssoc->virtualInst = ob;
    DOINCREF(ob);
    }
    + XDODECREF(old);
    RETURN_NONE;
    }

    diff -ur --strip _orig/win32cmd.cpp ./win32cmd.cpp
    --- _orig/win32cmd.cpp 2009-01-08 22:33:00 +0000
    +++ ./win32cmd.cpp 2011-11-21 13:10:26 +0000
    @@ -208,17 +208,20 @@
    RETURN_ERR("The parameter must be a callable object or None");

    void *oldMethod = NULL;
    - // note I maybe decref, then maybe incref. I assume object wont be destroyed
    - // (ie, ref go to zero) between the 2 calls!)
    + // we need incref's early in order to avoid a ref erronously going to zero during DEDECREF
    + if (method!=Py_None) {
    + Py_INCREF(method);
    + Py_INCREF(hookedObject);
    + }
    if (pList->Lookup(message, oldMethod)) {
    pList->RemoveKey(message);
    // oldMethod is returned - don't drop its reference.
    DODECREF(hookedObject);
    }
    if (method!=Py_None) {
    - Py_INCREF(method);
    + // already done above: Py_INCREF(method);
    pList->SetAt(message,method);
    - Py_INCREF(hookedObject);
    + // already done above: Py_INCREF(hookedObject);
    }
    if (oldMethod)
    return (PyObject *)oldMethod;

     
1 2 > >> (Page 1 of 2)