#26 The long-run usage is unexpectedly memory-consuming

Optimizations
closed-fixed
7
2005-05-27
2005-05-17
Francesco Garosi
No

The module is reported to be extremely memory-
consuming when it is used in long-run sessions. This
appears to happen in the low-level part of the module,
and involves the construct creation methods
(Build<entity>) as well as the Print<entity> functions.
An example, provided by the user who reported the
erratic behaviour, is attached (I previously removed any
possible identification).

Discussion

  • Test case for memory usage demonstration

     
  • JFGuilbert
    JFGuilbert
    2005-05-17

    Logged In: YES
    user_id=1280410

    Question:

    All your Python API functions of (clipsmodule.c) use
    RETURN_PYOBJECT to return from the shared library to your
    _clips_wrap.py code. That macro uses PY_INCREF for every
    reference returned. When a wrapped object
    (_clips_wrap.Fact, as an example) is destroyed, shouldn't
    there be a call to the library again to perform a PY_DECREF
    of the wrapped reference?

    Also, is there a need to increment the reference count of
    all the returned objects from the API? After checking other
    Python wrappers (sqlite as an example), if you would have
    internal references, then PY_INCREF/PY_DECREF should be used.

    From what I saw, returning objects created in shared
    libraries are, in most cases, not incremented before the
    return of the function. (Py_None seems to be the exception).

    When clips.Assert(<string>) is used from Python, that
    function creates a fact using: Fact(_c.assertString(o))
    making the result of _c.assertString an attribute of Fact.

    The object created in the library is incremented before the
    return of the function (using the RETURN_PYOBJECT macro).
    That Python attribute reference seems to never get decreased.

    Test:

    I've defined in clipsmodule.c, after #define
    RETURN_PYOBJECT, the following:

    #define RETURN_PYOBJECTNINC(_p) { return (PyObject *)(_p); }

    and changed <static PyObject *g_assertString> to use it
    instead of RETURN_PYOBJECT(p):
    /*RETURN_PYOBJECT(p);*/
    RETURN_PYOBJECTNINC(p);

    and re-executed the available test case using clips.Assert.
    The output is really close to the clips.SendCommand when
    references are not incremented.

    From Python's Documentation of <1.10.1 Reference Counting in
    Python>:
    "...The owner of a reference is responsible for calling
    Py_DECREF() when the reference is no longer needed...
    ...Forgetting to dispose of an owned reference creates a
    memory leak."

    I didn't build complex Python wrappers yet, but this
    PY_INCREF/PY_DECREF usage has an impact on instances created
    in the shared library and makes the CLIPS (or Python?) free
    up more memory when required. But, changing the PY_INCREF
    usage from the RETURN_PYOBJECT macro might lead to other
    memory issues.

    I also noticed that the repetitive use of
    clips.StdoutStream.Read() really makes the memory usage grow
    if CLIPS rules have a lot of <printout t> in them.

    The clipsmodule.c <g_routerRead> function also uses the
    RETURN_PYOBJECT macro, and replacing it with the test
    RETURN_PYOBJECTNINC one had the same beneficial effect on
    memory.

     
  • Logged In: YES
    user_id=328337

    Well, in fact I call Py_INCREF because I read that "The
    object reference returned from a C function that is called
    from Python must be an owned reference -- ownership is
    tranferred from the function to its caller" in the same
    page. Probably I misunderstood that part: I will check other
    modules to see what they do (I thought that I had to
    increment a reference count for the objects I create using
    <someobject>_New in order to ensure that the objects are not
    finalized by garbage collection). I will try not to
    increment references using your method (the RETURN_PYOBJECT
    macro was created just in case people like you found this
    issue), and to fire GC from time to time to see if the
    module causes a segmentation fault.

    What really surprised me before I read your last message, is
    that my memory leak affects the internal CLIPS memory (the
    one reported by clips.Memory.Used) which is somewhat apart
    from the Python interpreter. But whenever I create an object
    of type "Fact" or "Instance", i lock it (using CLIPS own
    locking) in order to avoid CLIPS to deallocate it using
    _its_ own garbage collector. Some of the issues within CLIPS
    memory are very likely to be related to this: the CLIPS lock
    remains active until the related PyObject is deleted.

    As soon as I'm back home (it's going to take some hours,
    however) I'll start testing PyCLIPS without incrementing
    references to newly created objects. In case all works as
    expected, I'll do my best to re-release the module as quick
    as possible.

    Thank you for pointing out this.

    Cheers,

    F.

     
  • Logged In: YES
    user_id=328337

    I think that your suggestion could actually solve the problem:
    there is only one point where garbage collection must not be
    able to destroy objects, that is in new environment allocation.

    I converted the RETURN_PYOBJECT() macro to the one
    without Py_INCREF (i better looked at the manual page, and
    it says that PyObject_New() already returns an object with
    reference count set to 1), and the test suite works. This is
    important, because the change affects Assert() as well as all
    the Build<entity>() functions, plus some more in the module.
    As a result it looks like the test case you provided uses the
    same memory for both API and non-API (the API version
    might be slightly more efficient, as expected, being more
    direct). I also noticed a better performance, probably due to
    the reduced memory allocation and garbage collection
    overhead.

    However, reviewing the code, I noticed that I have to keep a
    reference for each allocated environment, in order not to let
    Python actually free environment data, as I noticed that this
    could cause corruption of the CLIPS engine heap. I think that
    disallowing environment destruction can be a good
    compromise for now, as in most cases there should be a
    small number of environments per PyCLIPS (and thus
    Python) session to be kept alive from the beginning to the
    end. This should not be a problem for most applications.
    However I will open an RFE for this.

    The fixes have been committed to the CVS repository.
    However I also attach a file containing clipsmodule.c (and a
    small update to the tests_00.py file, which forces garbage
    collection at each test). I hope that the new version really
    solves the problem you found. I'm always here anyway for
    any other bugs (including the other issue we're working on).

    Cheers,

    F.

     
  • Possible fixes for bug 1203726 (rev. 1)

     
    Attachments
  • Possible fixes for bug 1203726 (rev. 2)

     
    Attachments
  • Logged In: YES
    user_id=328337

    I just committed to the CVS repository a new version of some
    of the source files: actually all module (code) files have
    changed at least in some parts. This version corrects the
    wrong use of Py_INCREF/Py_DECREF especially on
    function failure. I tested it both on Linux and on Win32 with
    successful and failed function calls.

    I also attach a zip file containing the C source files and the
    Python files affected by these recent changes.

    The testcase you provided succeeds, and the same results
    should be expected also with the Build<entity> functions and
    others that incremented reference counts in the wrong way.

    You can use the attached files to rebuild the module, and
    probably the results will be better regarding memory usage.
    In that case, I'll wait for your response to close the bug.

    Thank you again,

    F.

     
  • JFGuilbert
    JFGuilbert
    2005-05-27

    Logged In: YES
    user_id=1280410

    I've tested your module changes, within our 'long running'
    processes, using the API methods, over a few days and the
    memory results are near what I experienced with
    'SendCommand' for the same period of time.

    The memory usage is definitivelly more efficient and your
    last changes really makes a difference over repetitive use
    of the API.

    With this, I'm ready to start using the next official
    release of PyClips.
    Thanks again

    JF

     
  • Logged In: YES
    user_id=328337

    I'm happy that we finally found a way through the problem:
    your help has been invaluable to find the bug and correct it. I
    actually was hoping for these results, and have already
    prepared the new packages, which I will immediately submit
    to the file release area. By the way, have you tried to build
    PyCLIPS on a newer version of Python 2.3?

    I hope that you will either email me or post a message on SF
    for any other issues you should find: from what you write, I
    see that you are really stressing the module, and your
    results will be of particular interest for the "robustness" of
    PyCLIPS.

    Thank you again, hope to hear from you soon!

    F.

     
  • Logged In: YES
    user_id=328337

    This bug has been fixed and new packages have been uploaded to
    the file release area.

     
    • status: open --> closed-fixed