#59 Zombie Processes Consuming All RAM

Status: closed
Owner: Amorilia
Labels: PyFFI (66)
Priority: 5
Updated: 2012-10-27
Created: 2011-08-13
Creator: Alphanos
Private: No

Good evening,

I've been trying a number of things to resolve this issue, but I think I've reached the limits of what I can figure out to attempt.

I've been experiencing an issue with PyFFI where, as it continues to run, a pool of zombie processes accumulates until they consume all system memory and swap, crashing the system. This occurs on a fully updated Windows XP machine with Python v2.7.2 or v2.7.1 and PyFFI v2.1.9 or v2.1.10 beta 3. Other people are evidently not having the same problem, so there must be something different about my setup, which I have not been able to identify.

This is possibly related to an earlier bug, https://sourceforge.net/tracker/index.php?func=detail&aid=2787602&group_id=199269&atid=968813, which I found in a search of the tracker, but it is happening with current versions of all the software involved.

This problem can be partially worked around by processing only small batches of meshes at a time, since once PyFFI completes its assigned task all of the zombie processes terminate properly. I've found somewhere around 2,000 meshes to be the sweet spot, and testing indicates the limit depends on the number of mesh files rather than their size.

As trying various combinations of earlier versions made no difference, I am now using Python v2.7.2, PyFFI v2.1.10 beta 3, the PyFFI optimization kit v7 found here http://www.tesnexus.com/downloads/file.php?id=37463, and Arthmoor's update found here http://www.4shared.com/file/gvbLWTgs/OptKitFixes.html for its processing of far nifs. Note that this issue occurs regardless of whether I use the kit or PyFFI's standard ini files, right-click options, etc., so I have once again returned to the currently adopted best practices for optimizing Oblivion meshes. If you are unfamiliar with it, the kit simply runs four spells in sequence to produce the best results on a varied set of input meshes.

Examining the processes in Process Explorer, I observed that the command line for the oldest one read "C:\Program Files\Python 2.7\python.exe" "-c" "from multiprocessing.forking import main; main()" "--multiprocessing-fork" "1700". Going down the list, the numbers slowly decreased, with the lowest numbers at the bottom belonging to the active processes still using CPU time. The other numbers I had time to write down were, in sequence: 1680, 1672, 1616, 1600, 1596, 1548, 1532, 1528, 1480, 1464, 1460, 1412, 1396.
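
For context, here is a minimal sketch (my own illustration, not PyFFI's actual code) of the pattern behind those command lines. On Windows, where there is no fork(), Python's multiprocessing launches each worker as a fresh python.exe via that "-c" incantation, and the trailing number appears to be an inherited handle rather than a file identifier. Workers in a pool stay alive until the pool is explicitly closed and joined:

    import multiprocessing

    def optimize(path):
        # stand-in for per-file work such as an optimization spell
        return path

    if __name__ == '__main__':  # required on Windows, where workers re-import the script
        pool = multiprocessing.Pool(processes=4)
        results = pool.map(optimize, ['a.nif', 'b.nif'])
        # without close() and join(), the idle workers stay alive after
        # the work is done, much like the accumulation described above
        pool.close()
        pool.join()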

Shortly after I wrote these numbers down, the first spell of the kit completed and the others, which are much faster, began. Interestingly, precisely the same numbers appeared on the zombie processes during the subsequent spells. While I am only guessing, if the numbers identify particular files, perhaps there are specific files or file types that cause a process not to be terminated?

I have attached an archive containing my most recent set of log files, along with the kit's ini files and batch file used to generate them.

Please let me know what other information I may be able to provide, or any other tests I could run which might be of help.

Thanks!

Discussion

  • Alphanos - 2011-08-13

    Logs for the second, third, and fourth spells, along with the ini files used to create them

  • Alphanos - 2011-08-13

Primary log file; it exceeded SourceForge's upload limit, so I had to split it.

  • Amorilia - 2011-09-25

    Thanks for the report. This might take a few iterations before we figure out what's going wrong.

    First: do you see the same problem if you simply right-click oblivion_optimize.ini (in Program Files\PyFFI\utilities\toaster), i.e. are you certain that this isn't just a problem with whatever optimization kit you're using?

  • Alphanos - 2011-09-25

    Hey Amorilia, great to see that you're back ;-).

Yes, the problem occurred with every method I could come up with to launch PyFFI, including the standard right-click method on the stock ini files.

From further experimentation since the original post, it seems that, for whatever reason, each process that gets launched to handle a group of files just won't end. At this point I'm assuming something hangs while it tries to clean up.

As posted in the PyFFI Optimization Kit thread, for the time being I've developed a workaround set of batch files that process folders of meshes one at a time, invoking and terminating PyFFI separately for each; a rough equivalent is sketched below. While this doesn't avoid the zombies, it keeps their number low enough that they can't eat all system RAM. Once each PyFFI run terminates completely, all of its processes close as expected.
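
    In case it is useful, here is a rough Python equivalent of those batch files (just a sketch; the niftoaster script location, spell name, and paths below are from my setup and may need adjusting):

        import os
        import subprocess

        # example paths from my setup; adjust as needed
        MESH_ROOT = r'C:\Games\Oblivion\Data\meshes'
        TOASTER = r'C:\Program Files\Python 2.7\Scripts\niftoaster.py'

        for name in sorted(os.listdir(MESH_ROOT)):
            folder = os.path.join(MESH_ROOT, name)
            if not os.path.isdir(folder):
                continue
            # one separate PyFFI run per folder; when each run exits, its
            # worker processes are cleaned up, so the zombie count stays low
            subprocess.call(['python', TOASTER, 'optimize', folder])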

  • Amorilia - 2011-09-25

This might sound lame, but have you tried setting the number of jobs to 1? I suspect there might be something funny with multiprocessing support on your system. Running with a single job avoids multiprocessing altogether (but will also cause Python to slowly leak memory...).
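
    Concretely, a single-job run would look something like this (assuming the toaster's --jobs option is available in your version; adjust the paths for your install):

        python "C:\Program Files\Python 2.7\Scripts\niftoaster.py" --jobs=1 optimize "C:\path\to\meshes"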

  • Amorilia - 2011-11-26

    Seems fixed. Closing.

