Re: [Jython-users] exec and globals, catching OutOfMemError

Frank, Robert:

About catching java.lang.OutOfMemoryError:
when it is thrown from Java code (i.e. from within the interpreter),
I could not catch it using Jython 21a3. It was thrown
while interpreting a Jython function in infinite recursion.

In the example I saw earlier, java.lang.OutOfMemoryError was
thrown from within Jython code, so it probably gets encapsulated
into a Jython exception, allowing it to be caught in Jython code.

The code below allows you to execute a Jython script from
within Jython, much as if it was called from a shell as an
argument to the interpreter. Various sys settings are adjusted
and restored afterwards, and any exception raised by the script is returned.
It also unloads modules, but it does not use jreload.

I edited it a bit to clarify some names, and the indentation
is probably affected by mail wrapping, but it should be
easy to get it running.
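
For orientation, here is a condensed sketch of the save/run/restore
pattern the module below is built around. It is written for a current
CPython 3 rather than Jython 21a3, so it uses exec(compile(...)) where
the module uses execfile(). The names run_script and out are made up for
this sketch and do not appear in the module.

```python
import os
import sys
import types


def run_script(path, args, out):
    """Run the file at `path` as __main__; return sys.exc_info() or None.

    Hypothetical helper: a simplified stand-in for executeScript() below.
    """
    saved_argv = sys.argv[:]
    saved_path = sys.path[:]
    saved_stdout = sys.stdout
    saved_main = sys.modules.get('__main__')

    main_module = types.ModuleType('__main__')   # fresh globals per run
    main_module.__file__ = path
    sys.argv[:] = [path] + list(args)
    sys.path.insert(0, os.path.dirname(os.path.abspath(path)))
    sys.stdout = out                             # print() output goes to `out`
    sys.modules['__main__'] = main_module
    try:
        try:
            with open(path) as f:
                code = compile(f.read(), path, 'exec')
            exec(code, main_module.__dict__)
        except Exception:
            return sys.exc_info()                # the caller decides what to log
        return None
    finally:
        # Restore the interpreter state whether or not the script failed.
        if saved_main is not None:
            sys.modules['__main__'] = saved_main
        else:
            sys.modules.pop('__main__', None)
        sys.stdout = saved_stdout
        sys.path[:] = saved_path
        sys.argv[:] = saved_argv
```

Note that this sketch catches only Exception, so a SystemExit raised by
the script propagates to the caller.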

Have fun,
Ype

"""
Execute python scripts from a search path (a set of directories).
Allow the scripts to import modules from their own directory,
from the script search path and from sys.path.
Allow unloading of modules from the first directory on the search path.
It is intended to use a local directory and a network directory,
allowing for private (local) and shared script modules.
In this way only private modules can be unloaded.

All file names handled here are normalized: lower case is used and single
backward slashes replace (sequences of) forward and backward slashes.
(In jython 21a2 (1.1.8, OS2) sys.path contains a strange mix of upper and lower
case.)

Appropriate sys.argv, sys.path, and globals are set for the scripts,
and sys.modules['__main__'] is replaced by the new module
during its execution.
The __name__ variable for the started script is set to '__main__'.
Also sys.stdout (ie. print statements in scripts) is redirected
during execution of scripts.

Any exception raised by the script is caught and returned.

Limitations.

In jython 21a2 imported modules must have a lowercase .py extension.
(Before 21a2 .PY uppercase was also ok.
This is considered a bug and it is announced to be fixed in
the next jython release.)

In case the script goes into an infinite loop, the
only way out is by closing the process, eg. from the process list.

In case the script goes into infinite recursion, the process
terminates silently (OS2 JVM 1.1.8) when it runs out of stack.
Unfortunately in Java it is not possible to limit the recursion depth,
and it is not easy to provide a better alternative.
In case too much memory is required for python data the process
exits with a suggestion to use -mx to increase available memory.
In both of these cases java.lang.Throwable is not caught, although
the except: clause is there.
Since both cases are almost always caused by a bug in the executed
script, it is recommended to inspect the suspect code, fix it and/or add
some trace statements to locate the problem, and to try again.

One alternative might be to trace all function calls and monitor the
number of entries on the stack, but this would impose a large
overhead in execution time.
CPython allows limiting the recursion depth (sys.setrecursionlimit());
Jython has no facility for this.

Unloading of modules is done by removing the references in sys.modules.
Actual unloading is left to the garbage collector. Better facilities
may become available in Jython under Java 1.2 (see java class unloading
in the Jython documentation.)

Compiled modules ($py.class files for imported .py modules)
contain their original path name.
Therefore after renaming a package directory all its class files should
be deleted, otherwise unloading might not be done correctly.

The function sys.exc_info() is used in such a way that a circular
reference would be created in CPython. Earlier versions of CPython
might have problems when collecting this as garbage.
"""

import traceback, sys, os, time
import java # java.lang.Throwable

StartingScript__name__ = '__main__'

from org.python.util import PythonInterpreter

# Path handling
from os.path import normpath, normcase, dirname

doubleSep = os.sep + os.sep
sepDot = os.sep + '.' # current working directory ends in '\\.' on sys.path (21a2, 1.1.8, OS2): not yet a problem
sepDotSep = sepDot + os.sep

if os.sep == '\\': otherSep = '/'
else: otherSep = '\\'

def normalPath(pathName): # OS2 version
    p = normpath(normcase(pathName.lower().replace(otherSep, os.sep)))
    while p[1:].find(doubleSep) >= 0: # replace multiple backslashes by single, except in front (network drives)
        p = p[0] + p[1:].replace(doubleSep, os.sep)
    return p

PythonExtension = '.py' # Used for main programs only.

# End of path handling.

class ScriptExitException(Exception): pass

class ScriptInterpreter:

    def __init__(self, scriptDirs, notUnloadDir, scriptGlobals, logFunc, printFunc):
        self.scriptDirs = [(normalPath(scriptDir) + os.sep) for scriptDir in scriptDirs]
        self.scriptUnloadDir = self.scriptDirs[0] # lower case required by self.UnloadModules()
        # to not unload modules of the script interpreter itself:
        self.notUnloadDir = normalPath(notUnloadDir) + os.sep
        self.scriptPathCache = {}
        self.scriptGlobals = scriptGlobals.copy()
        self.scriptGlobals['__name__'] = StartingScript__name__
        self.logFunc = logFunc # log output from ScriptInterpreter

        self.printFunc = printFunc # used to redirect print statements from scripts, see below
        self.bufferedLine = None

        import new
        # replace sys.modules[StartingScript__name__] during execfile()
        self.mainModule = new.module(StartingScript__name__)
        # Note: self.mainModule.__dict__ should not be assigned to.

    def findScriptAndImportPaths(self, scriptName):
        """ for scriptName return tuple of full path name and system search path
        for importing modules
        """
        if self.scriptPathCache.has_key(scriptName): # cache non normalized: faster most of the time
            return self.scriptPathCache[scriptName]
        scriptName = normalPath(scriptName)
        fileName = scriptName + PythonExtension
        for scriptPathDir in self.scriptDirs:
            scriptPath = scriptPathDir + fileName
            try:
                os.stat(scriptPath)
            except OSError:
                self.logFunc(scriptPath + ' not found.')
            else:
                scriptDir = dirname(scriptPath) + os.sep # maybe os.path.abspath() should be used.
                # for tracking down os.path.dirname() platform differences:
                # print repr(scriptPath), repr(scriptDir)
                if scriptDir in self.scriptDirs: # import always from own dir first:
                    otherscriptDirs = self.scriptDirs[:]
                    otherscriptDirs.remove(scriptDir) # returns None, cannot substitute below
                    sysPath = [scriptDir] + otherscriptDirs + sys.path
                else:
                    sysPath = [scriptDir] + self.scriptDirs + sys.path
                res = (scriptPath, sysPath)
                self.scriptPathCache[scriptName] = res
                return res
        sysPath = self.scriptDirs + sys.path # not found, use default
        res = (scriptPath, sysPath)
        self.scriptPathCache[scriptName] = res
        return res

    def executeScript(self, scriptName, scriptArgs): # returns exception info if the script raised, else None.
        #java.lang.OutOfMemoryError falls through.
        savedArgv = sys.argv[:] # sys.argv changed in place below
        savedPath = sys.path
        savedStdout = sys.stdout
        try: savedMainMod = sys.modules[StartingScript__name__]
        except KeyError: savedMainMod = None

        (sys.argv[0], sys.path) = self.findScriptAndImportPaths(scriptName)
        sys.argv[1:] = scriptArgs

        sys.stdout = self # see self.flush() and self.write(), redirect print statements from scripts

        statusMsg = 'Unknown exception, internal error.'
        try:
            try:
                scriptGlobals = self.mainModule.__dict__ # avoid the dot operators using a local var.
                scriptGlobals.clear()
                scriptGlobals.update(self.scriptGlobals)
                # (optionally add predefinitions for scripts to scriptGlobals at this point)
                sys.modules[StartingScript__name__] = self.mainModule # allow finding globals by module name
                # execfile() suggested by Finn Bock's email, Jan 2001
                execfile(sys.argv[0], scriptGlobals) # execfile does not do any module administration
            except ScriptExitException:
                statusMsg = 'Exiting'
                exceptionInfo = None
            except Exception, e: # all other exceptions, including script file not found.
                statusMsg = 'Exception: ' + str(e)
                exceptionInfo = sys.exc_info()
            except java.lang.Throwable, jt: # only used when running jython
                # Does not catch an actual java.lang.OutOfMemoryError (Jython 21a1, 21a2, 21a3).
                # Catches java.lang.NullPointerException happening after an
                # internal compiler error "Name:.... at line .." while importing a module
                # with an undefined name. 12-4-2001 (This seems to be a Jython 21a1 bug.)
                statusMsg = 'Java Throwable ' + str(jt)
                exceptionInfo = sys.exc_info()
            except: # Hopefully this catches very very unusual errors. (Never seen.)
                statusMsg = 'Deprecated exception.' # not derived from standard Exception
                exceptionInfo = sys.exc_info()
                ei = exceptionInfo # Show the exception in the log.
                maxDepth = 20
                import linecache
                linecache.checkcache() # up to date code in the traceback
                tbText = ''.join(traceback.format_exception(ei[0], ei[1], ei[2], maxDepth))
                sys.stdout = savedStdout # restore stdout before using self.logFunc()
                self.logFunc(tbText) # try and get info on the error also from the log
            else:
                statusMsg = 'Ending'
                exceptionInfo = None
        finally: # all exceptions should be caught, but anyway
            self.flush()
            if savedMainMod: sys.modules[StartingScript__name__] = savedMainMod
            else: del sys.modules[StartingScript__name__]
            sys.stdout = savedStdout
            sys.path = savedPath
            sys.argv = savedArgv

        return exceptionInfo # Note: exceptions raised by the code in this method itself are not caught.

    def ExitScript(self): # should only be called during the execfile() in self.executeScript()
        raise ScriptExitException()

    def UnloadModules(self): 
        """ remove from sys.modules the modules loaded from self.scriptUnloadDir, and clear self.scriptPathCache """
        self.scriptPathCache.clear()
        scriptUnloadDirLen = len(self.scriptUnloadDir)
        namesMods = sys.modules.items()
        namesMods.sort()
        namesMods.reverse() # delete packages after the modules they contain.
        for modName, mdl in namesMods: # iterate copy, remove from original below.
            if hasattr(mdl, '__file__'):
                fileName = mdl.__file__.replace(sepDotSep, os.sep).lower()
                if fileName.startswith(self.scriptUnloadDir) and not fileName.startswith(self.notUnloadDir):
                    self.logFunc('Unloading module ' + modName + ' (' + fileName + ')')
                    del sys.modules[modName] # remove the reference, actual unloading left to garbage collection.

    # maintain self.bufferedLine for self.write() and self.flush()
    def appendToBufLine(self, text):
        if self.bufferedLine is None:
            self.bufferedLine = text
        elif text:
            self.bufferedLine += text

    # as sys.stdout
    def flush(self):
        if self.bufferedLine is not None:
            self.printFunc(self.bufferedLine)
            self.bufferedLine = None

    def write(self, text):
        if len(text) and text.endswith('\n'):
            line = text[:-1]
            assert line.find('\n') == -1, "line should be split at ends of lines" # CHECKME
            self.appendToBufLine(line)
            self.flush()
        else:
            self.appendToBufLine(text)
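
The unloading approach used by UnloadModules() above (drop the entries
from sys.modules and leave the rest to the garbage collector) can be
tried on its own. The following is a minimal sketch for a current
CPython 3, not for Jython 21a3. The function name unload_modules_under
is hypothetical, and it compares real paths instead of the lower-cased
OS2 paths used in the module.

```python
import os
import sys


def unload_modules_under(directory):
    """Remove from sys.modules every module whose __file__ lies under directory.

    Returns the sorted list of removed module names.
    """
    prefix = os.path.normcase(os.path.realpath(directory)) + os.sep
    removed = []
    for name, module in list(sys.modules.items()):   # iterate over a copy
        file_name = getattr(module, '__file__', None)
        if not file_name:
            continue                                  # built-in or namespace module
        if os.path.normcase(os.path.realpath(file_name)).startswith(prefix):
            del sys.modules[name]                     # drop the reference only
            removed.append(name)
    return sorted(removed)
```

As in the module above, this only removes the references in sys.modules.
Code that still holds the old module object keeps using it, and the next
import of the same name loads the file again and creates a new module.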