New submission from Anselm Kruis <a.kruis@...>:
The zxjdbc PyConnection class contains two HashSets -
"private Set<PyCursor> cursors;" and
"private Set<PyStatement> statements;"
I found the following issue with the "cursors" set:
The PyConnection#cursor() methods create PyCursor objects and add these objects to the "cursors" set. The PyCursor#close() method removes the cursors from the "cursors" set.
PEP 249 describes the cursor's close() method as: "Close the cursor now (rather than whenever __del__ is called)." This wording implies that it is not required to close a cursor explicitly, because __del__ will close it anyway.
Now, if one doesn't close a zxjdbc PyCursor manually and the cursor object becomes unreachable from Python code, we get a memory leak, because the hard reference from the PyConnection to the PyCursor object prevents the finalisation of the cursor. That's a problem for long-running database connections that perform heavy batch processing. (I found this issue while writing a batch import script for a Django-on-Jython powered site.)
I propose to change the current implementation of the Set fields in PyConnection to become a WeakHashSet (actually a WeakHashMap of which only the keySet is used, because the Java runtime lacks a WeakHashSet). This way, the PyConnection can still close all cursors in its PyConnection#close method, but doesn't prevent a PyCursor from being finalized/garbage collected.
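To illustrate the idea (this is not the attached patch), here is a minimal sketch of a weak set built on a WeakHashMap. The class name WeakCursorRegistry and its methods are hypothetical and do not exist in zxjdbc; the snapshot() method is my assumption about how a close-all loop could avoid modifying the set while iterating over it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.WeakHashMap;

/**
 * Hypothetical stand-in for the cursor bookkeeping in PyConnection.
 * Illustration only; not the zxjdbc implementation.
 */
public class WeakCursorRegistry<T> {
    // WeakHashMap references its keys weakly, so an entry disappears
    // once the key is no longer strongly reachable elsewhere.
    // The value is only a dummy marker; the keySet acts as the "weak set".
    private final Map<T, Boolean> members = new WeakHashMap<T, Boolean>();

    /** Register an object, as PyConnection#cursor() would do for a new cursor. */
    public void add(T member) {
        members.put(member, Boolean.TRUE);
    }

    /** Unregister an object, as PyCursor#close() would do. */
    public void remove(T member) {
        members.remove(member);
    }

    /** Number of entries currently held (may shrink after a garbage collection). */
    public int size() {
        return members.size();
    }

    /**
     * Copy of the members that are still reachable. Iterating over the copy
     * lets a close-all loop call remove() without a
     * ConcurrentModificationException.
     */
    public List<T> snapshot() {
        return new ArrayList<T>(members.keySet());
    }
}
```

On Java 6 and later, Collections.newSetFromMap(new WeakHashMap<T, Boolean>()) gives an equivalent Set view directly. Note that WeakHashMap compares keys with equals()/hashCode(), and that the sketch is not thread-safe.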
During the development of the attached patch, I found that the same issue probably exists with the "statements" set. Therefore I made this set a weak set too. I verified the patch using the MAT heap dump analyser.
title: com.ziclix.python.sql.PyConnection leaks memory
versions: 2.5.0, 2.5.1
Added file: http://bugs.jython.org/file774/PyConnection.diff
Jython tracker <report@...>