From: A.M. K. <aku...@us...> - 2002-12-17 14:31:49
Update of /cvsroot/py-howto/pyhowto
In directory sc8-pr-cvs1:/tmp/cvs-serv3716
Modified Files:
rexec.tex
Log Message:
Withdraw the rexec HOWTO
Index: rexec.tex
===================================================================
RCS file: /cvsroot/py-howto/pyhowto/rexec.tex,v
retrieving revision 1.12
retrieving revision 1.13
diff -C2 -r1.12 -r1.13
*** rexec.tex 26 Nov 2002 16:05:50 -0000 1.12
--- rexec.tex 17 Dec 2002 14:31:42 -0000 1.13
***************
*** 3,7 ****
\title{Restricted Execution HOWTO}
! \release{2.0}
\author{A.M. Kuchling}
--- 3,7 ----
\title{Restricted Execution HOWTO}
! \release{2.1}
\author{A.M. Kuchling}
***************
*** 14,559 ****
\begin{abstract}
\noindent
- Python provides a restricted execution mode for running untrusted code
- that will prevent the code from performing dangerous operations. This
- HOWTO explains how to use restricted execution mode, and how to
- customize the restricted environment for your application. It aims to
- provide a gentler introduction than the corresponding section in the
- Python Library Reference.
! This document is available from the Python HOWTO page at
! \url{http://www.python.org/doc/howto}.
\end{abstract}
- \tableofcontents
-
- \section{Basic use of \class{RExec}}
-
- For some applications, it's desirable to execute chunks of Python code
- that come from an outside source. The most obvious example is a Web
- browser such as Grail, which can download and execute applets written in
- Python.
-
- An obvious danger of downloading and running code from anywhere is that
- someone might write a malicious applet that appears to be harmless, but
- silently erases files, makes copies of sensitive data, or gives the
- applet's author a back door into your system. The solution is to run
- the code in a restricted environment, where it's prevented from
- performing any operations that could be used maliciously.
-
- Java does this by using the Java Virtual Machine, which executes Java
- bytecode. The virtual machine, or VM, has complete control over the
- running applet, and any dangerous operations must go through the VM in
- order to be performed. The VM can therefore trap suspicious activity,
- and stop the applet's execution, if a strict security policy is used, or
- ask the user if the operation should be permitted, if the policy is
- somewhat looser.
-
- Python already has a virtual machine that executes Python byte codes, so
- creating a restricted execution environment simply requires sealing off
- dangerous built-in functions such as \code{open()}, and dangerous
- modules, such as the \code{socket} module. This can be done by creating
- new namespaces, removing any dangerous functions, and forcing code to be
- executed in those namespaces. While a simple idea, in practice it's
- fairly complicated to implement. Luckily, the required features have
- been present in Python for a while, and it's already been implemented
- for you as a standard module.
-
- Code for using a restricted execution environment is in the \file{rexec}
- module. The base class is called \class{RExec}; in a later section of
- this HOWTO, we'll show you how to create your own subclasses of
- \class{RExec} to customize the functions and modules that are available.
- Here's the documentation for creating a new \class{RExec} instance:
-
- \begin{funcdesc}{RExec}{[\var{hooks}], [\var{verbose}] }
- Returns a \class{RExec} instance. The \var{verbose} parameter is a
- Boolean value, defaulting to false. If true, the \class{RExec} instance
- will execute in verbose mode, which will print a debugging message when
- modules are imported, as if the \code{-v} option was given to the Python
- interpreter.
-
- The \var{hooks} parameter can be an instance of the \code{RHooks}
- class, or of some subclass of \code{RHooks}; a default instance will
- be used if the parameter is omitted. This is only required when
- creating particularly exotic restricted environments that import
- modules in new ways. If you need to use this, you'll have to
- consult the source code (or Guido) for a complete picture of what's
- going on.
- \end{funcdesc}
-
- The \class{RExec} instance has \code{r_exec()}, \code{r_eval()}, and
- \code{r_execfile()} functions, which do the same thing as Python's
- built-in \code{exec()}, \code{eval()}, and \code{execfile()} functions,
- performing them in the restricted environment. (There are also
- \code{s_exec()}, \code{s_eval()}, and \code{s_execfile()} methods which
- replace the restricted environment's standard input, output, and error
- files with \code{StringIO} objects that allow you to control the input
- and capture any output generated.)
-
- Here's a sample usage of a restricted environment. First, the
- \class{RExec} instance has to be created.
-
- \begin{verbatim}
- r_env = rexec.RExec()
- \end{verbatim}
-
- Now, we can execute code and evaluate expressions
- in the environment:
-
- \begin{verbatim}
- r_env.r_exec('import string')
- expr = 'string.upper("This is a test")'
- print r_env.r_eval( expr )
- \end{verbatim}
-
- The first line executes a statement, importing the \code{string} module.
- Since it's considered a safe module, the operation succeeds. The second
- and third lines create a string containing an expression, and evaluate
- the expression in the restricted environment; it prints out \samp{THIS
- IS A TEST}, as you'd expect.
-
- Unsafe operations trigger an exception. For example:
-
- \begin{verbatim}
- r_env.r_exec('import socket')
- \end{verbatim}
-
- The previous line will cause an \code{ImportError} exception to be
- raised, with an associated string value that reads "untrusted dynamic
- module: _socket". Trying to open a file for writing is also forbidden:
-
- \begin{verbatim}
- r_env.r_exec('file = open("/tmp/a.out", "w")')
- \end{verbatim}
-
- This will raise an \code{IOError} exception, with an associated string
- value that reads "can't open files for writing in restricted mode". The
- restricted code can catch the exception in a \code{try...except} block
- and continue running; this is useful for writing code which works in
- both restricted and unrestricted mode. Opening files for reading will
- work, however.
-
- Exactly what restrictions does the base \class{RExec} impose? It limits
- the modules that can be imported to the following safe list:
-
- \begin{verbatim}
- audioop, array, binascii, cmath, errno, imageop,
- marshal, math, md5, operator, parser, regex,
- pcre, rotor, select, strop, struct, time
- \end{verbatim}
-
- In general, these are modules that can't affect anything outside of
- the executing code; they allow various forms of computation, but don't
- allow operations that change the filesystem or use network connections
- to other machines. (The \code{pcre} module may be unfamiliar. It's
- an internal module used by the \module{re} module, so restricted code
- can still use the \module{re} module to perform regular expression matches.)
-
- It also restricts the variables and functions that are available from
- the \code{sys} and \code{os} modules. The \code{sys} module only
- contains the following symbols:
-
- \begin{verbatim}
- ps1, ps2, copyright, version, platform, exit, maxint
- \end{verbatim}
-
- The \code{os} module is reduced to the following functions:
-
- \begin{verbatim}
- error, fstat, listdir, lstat, readlink,
- stat, times, uname, getpid, getppid,
- getcwd, getuid, getgid, geteuid, getegid
- \end{verbatim}
-
- Note that restricted code has some read-only access to the filesystem
- via functions like \code{os.stat} and \code{os.readlink}; if you wish to
- forbid all access to the filesystem, these functions must be removed.
-
- In restricted mode, there are various attributes of function and class
- objects that are no longer accessible: the \code{__dict__} attribute of
- class, instance and module objects; the \code{__self__} attribute of
- method objects; and most of the attributes of function objects, namely
- \code{func_code}, \code{func_defaults}, \code{func_doc},
- \code{func_globals}, and \code{func_name}.
-
- The \code{__import__()} and \code{reload()} functions are replaced by
- versions which implement the above restrictions. Finally, Python's
- usual \code{open()} function is removed and replaced by a restricted
- version that only allows opening files for reading.
-
- To change any of these policies, whether to be stricter or looser, see
- the section below on customizing the restricted environment.
-
- \section{Frequently Asked Questions}
-
- \emph{How do I guard against denial-of-service attacks? Or, how do I
- keep restricted code from consuming a lot of memory?}
-
- Even if restricted code can't open sockets or write files, it can still
- cause problems by entering an infinite loop or consuming lots of memory;
- this is as easy as coding \code{while 1: pass} or \code{'a' *
- 12345678901}. Unfortunately, there's no way at present to prevent
- restricted code from doing this. The Python process may therefore
- encounter a \code{MemoryError} exception, loop forever, or be killed by
- the operating system.
-
- One solution would be to perform \code{os.fork()} to get a child process
- running the interpreter. The child could then use the \code{resource}
- module to set limits on the amount of memory, stack space, and CPU time
- it can consume, and run the restricted code. In the meantime, the
- parent process can set a timeout and wait for the child to return its
- results; if the child takes too long, the parent can conclude that the
- restricted code looped forever, and kill the child process.
-
- \emph{If restricted code returns a class instance via \code{r_eval()},
- can that class instance do nasty things if unrestricted code calls its
- methods?}
-
- You might be worried about the handling of values returned by
- \code{r_eval()}. For example, let's say your program does this:
-
- \begin{verbatim}
- value = r_env.r_eval( expression )
- print str(value)
- \end{verbatim}
-
- If \code{value} is a class instance, and has a \code{__str__} method,
- that method will get called by the \code{str()} function. Is it
- possible for the restricted code to return a class instance where the
- \code{__str__} function does something nasty? Does this provide a way
- for restricted code to smuggle out code that gets run without
- restrictions?
-
- The answer is no. If restricted code returns a class instance, or a
- function, then, despite being called by unrestricted code, those
- functions will always be executed in the restricted environment. You
- can see why if you follow this little exercise. Run the interpreter in
- interactive mode, and create a sample class with a single method.
-
- \begin{verbatim}
- >>> class C:
- ... def f(self): print "Hi!"
- ...
- \end{verbatim}
-
- Now, look at the attributes of the unbound method \code{C.f}:
-
- \begin{verbatim}
- >>> dir(C.f)
- ['__doc__', '__name__', 'im_class', 'im_func', 'im_self']
- \end{verbatim}
-
- \code{im_func} is the attribute we're interested in; it contains the
- actual function for the method. Look at the function's attributes using
- the \code{dir()} built-in function, and then look at the
- \code{func_globals} attribute.
-
- \begin{verbatim}
- >>> dir(C.f.im_func)
- ['__doc__', '__name__', 'func_code', 'func_defaults', 'func_doc',
- 'func_globals', 'func_name']
- >>> C.f.im_func.func_globals
- {'__doc__': None, '__name__': '__main__',
- '__builtins__': <module '__builtin__'>,
- 'f': <function f at 1201a68b0>,
- 'C': <class __main__.C at 1201b35e0>,
- 'a': <__main__.C instance at 1201a6b10>}
- \end{verbatim}
-
- See how the function contains attributes for its \code{__builtins__}
- module? This means that, wherever it goes, the function will always use
- the same \code{__builtin__} module, namely the one provided by the
- restricted environment.
-
- This means that the function's module scope is limited to that of the
- restricted environment; it has no way to access any variables or
- methods in the unrestricted environment that is calling into the
- restricted environment.
-
- \begin{verbatim}
- r_env.r_exec('def f(): g()\n')
- f = r_env.r_eval('f')
- def g(): print "I'm unrestricted."
- \end{verbatim}
-
- If you execute the \code{f()} function in the unrestricted module, it
- will fail with a \code{NameError} exception, because \code{f()} doesn't
- have access to the unrestricted namespace. To make this work, you
- must insert \code{g} into the restricted namespace. Be careful when
- doing this, since \code{g} will be executed without restrictions; you
- have to be sure that \code{g} is a function that can't be used to do
- any damage. (Or is an instance with no methods that do anything
- dangerous. Or is a module containing no dangerous functions. You get
- the idea.)
-
-
- \emph{What happens if restricted code raises an exception?}
-
- The \module{rexec} module doesn't do anything special for exceptions
- raised by restricted code; they'll be propagated up the call stack
- until a \code{try...except} statement is found that catches them. If
- no exception handler is found, the interpreter will print a traceback and exit, which
- is its usual behaviour. To prevent untrusted code from terminating
- the program, you should surround calls to \code{r_exec()},
- \code{r_execfile()}, etc. with a \code{try...except} statement.
-
- Python 1.5 introduced exceptions that could be classes; for more
- information about this new feature, consult
- \url{http://www.python.org/doc/essays/stdexceptions.html}.
- Class-based exceptions present a problem; the separation between
- restricted and unrestricted namespaces may cause confusion. Consider
- this example code, suggested by Jeff Rush.
-
- t1.py:
- \begin{verbatim}
- # t1.py
-
- from rexec import RHooks, RExec
- from t2 import MyException
- r= RExec( )
-
- print 'MyException class:', repr(MyException)
- try:
- r.r_execfile('t3.py')
- except MyException, args:
- print 'Got MyException in t3.py'
- except:
- print 'Missed MyException "%s" in t3.py' % repr(MyException)
- \end{verbatim}
-
- t2.py
- \begin{verbatim}
- #t2.py
-
- class MyException(Exception): pass
- def myfunc():
- print 'Raising', `MyException`
- raise MyException, 5
-
- print 't2 module initialized'
- \end{verbatim}
-
- t3.py:
- \begin{verbatim}
- #t3.py
- import sys
- from t2 import MyException, myfunc
- myfunc()
- \end{verbatim}
-
- So, \file{t1.py} imports the \code{MyException} class from
- \file{t2.py}, and then executes some restricted code that also imports
- \file{t2.py} and raises \code{MyException}. However, because of the
- separation between restricted and unrestricted code, \code{t2.py} is
- actually imported twice, once in each mode. Therefore two distinct
- class objects are created for \code{MyException}, and the
- \code{except} statement doesn't catch the exception because it seems
- to be of the wrong class.
-
- The solution is to modify \file{t1.py} to pluck the class object out
- of the restricted environment, instead of importing it. The following
- code will do the job, if added to \code{t1.py}:
-
- \begin{verbatim}
- module = r.add_module('__main__')
- mod_dict = module.__dict__
- MyException = mod_dict['MyException']
- \end{verbatim}
-
- The first two lines simply get the dictionary for the \code{__main__}
- module; this is a usage pattern discussed above. The last line simply
- gets the value corresponding to 'MyException', which will be the class
- object for \code{MyException}.
-
- \section{Customizing The Restricted Environment}
- \label{sect-customizing}
-
- \subsection{Inserting Variables}
-
- While restricted code may be completely self-contained, it's common for
- it to require other data: perhaps a tuple listing various available
- plug-ins, or a dictionary mapping symbols to values. For simple Python
- data types, such as numbers and strings, the natural solution is to
- insert variables into one of the namespaces used by the restricted
- environment, binding the desired variable name to the value.
-
- Continuing from the examples above, you can get the dictionary
- corresponding to the restricted module named \code{module_name} with the
- following code:
-
- \begin{verbatim}
- module = r_env.add_module(module_name)
- mod_dict = module.__dict__
- \end{verbatim}
-
- Despite its name, the \code{add_module()} method actually only adds the
- module if it doesn't already exist; it returns the corresponding module
- object, whether or not the module had to be created.
-
- Most commonly, you'll insert variable bindings into the \code{__main__}
- or \code{__builtins__} module, so these will be the most frequent values
- of \code{module_name}.
-
- Once you have the module's dictionary, you need only insert a key/value
- pair for the desired variable name and value. For example, to add a
- \code{username} variable:
-
- \begin{verbatim}
- mod_dict['username'] = "Kate Bush"
- \end{verbatim}
-
- Restricted code will then have access to this variable.
-
- \subsection{Allowing Access to Unrestricted Objects}
-
- Often, the code being executed will need access to various objects that
- exist outside the restricted environment. For example, an applet should
- be able to read some attributes of the object representing the browser,
- or may need access to the \code{Tkinter} module to provide a GUI display.
- But neither the browser object nor the \code{Tkinter} module is safe, so
- what can be done?
-
- The solution is in the \code{Bastion} module, which lets you create
- class instances that represent some other Python object, but deny access
- to certain sensitive attributes or methods.
-
- \begin{funcdesc}{Bastion}{\var{object}, [\var{filter}], [\var{name}],
- [\var{class}] }
-
- Return a \code{Bastion} instance protecting the class instance
- \var{object}. Any attempt to access one of the object's attributes will
- have to be approved by the \var{filter} function; if the access is
- denied an \code{AttributeError} exception will be raised.
-
- If present, \var{filter} must be a function that accepts a string
- containing an attribute name, and returns true if access to that
- attribute will be permitted; if \var{filter} returns false, the access
- is denied. The default filter denies access to any function beginning
- with an underscore \samp{_}. The bastion's string representation
- will be \code{<Bastion for \var{name}>} if a value for
- \var{name} is provided; otherwise, \code{repr(\var{object})} will be used.
-
- \var{class}, if present, would be a subclass of \code{BastionClass};
- see the code in \file{bastion.py} for the details. Overriding the
- default \code{BastionClass} will rarely be required.
- \end{funcdesc}
-
- So, to safely make an object available to restricted code, create a
- \code{Bastion} object protecting it, and insert the \code{Bastion}
- instance into the restricted environment's namespace.
-
- For example, the following code will create a bastion for an instance,
- named \code{S}, that simulates a dictionary. We want restricted code to
- be able to set and retrieve values from \code{S}, but no other
- attributes or methods should be accessible.
-
- \begin{verbatim}
- import Bastion
- maindict = r_env.modules['__main__'].__dict__
- maindict['S'] = Bastion.Bastion(SS,
- filter = lambda name: name in ['__getitem__', '__setitem__'] )
- \end{verbatim}
-
- \subsection{Modifying Built-ins}
-
- Often you'll wish to customize the restricted environment in various
- ways, most commonly by adding or subtracting variables or functions from
- the modules available. At a more advanced level, you might wish to
- write replacements for existing functions; for example, a Web browser
- that executes Python applets would have an import function that allows
- retrieving modules via HTTP and importing them.
-
- An easy way to add or remove functions is to create the \class{RExec}
- instance, get the namespace dictionary for the desired module, and add
- or delete the desired function. For example, the \class{RExec} class
- provides a restricted \code{open()} that allows opening files for
- reading. If you wish to disallow this, you can simply delete 'open'
- from the \class{RExec} instance's \code{__builtin__} module.
-
- \begin{verbatim}
- module = r_env.add_module('__builtin__')
- mod_dict = module.__dict__
- del mod_dict['open']
- \end{verbatim}
-
- (This isn't enough to prevent code from accessing the filesystem;
- the \class{RExec} class also allows access
- via some of the functions in the \code{posix} module, which is usually
- aliased to the \code{os} module. See below for how to change this.)
-
- This is fine if only a single function is being added or removed, but
- for more complicated changes, subclassing the \class{RExec} class is a
- better idea.
-
- Subclassing can potentially be quite simple. The \class{RExec} class
- defines some class attributes that are used to initialize the restricted
- versions of modules such as \code{os} and \code{sys}. Changing the
- environment's policy then requires just changing the class attribute in
- your subclass. For example, the default environment allows restricted
- code to use the \code{posix} module to get its process and group ID. If
- you decide to disallow this, you can do it with the following custom
- class:
-
- \begin{verbatim}
- class MyRExec(rexec.RExec):
- ok_posix_names = ('error', 'fstat', 'listdir', 'lstat', 'readlink',
- 'stat', 'times', 'uname')
- \end{verbatim}
-
- More elaborate customizations may require overriding one of the methods
- called to create the corresponding module. The functions to be
- overridden are \code{make_builtin}, \code{make_main},
- \code{make_osname}, and \code{make_sys}. The \code{r_import},
- \code{r_open}, and \code{r_reload} methods are made available to
- restricted code, so by overriding these functions, you can change the
- capabilities available.
-
- For example, defining a new import function requires overriding
- \code{r_import}:
-
- \begin{verbatim}
- class MyRExec(rexec.RExec):
- def r_import(self, mname, globals={}, locals={}, fromlist=[]):
- raise ImportError, "No imports allowed--ever"
- \end{verbatim}
-
- Obviously, a less trivial function could import modules using HTTP, or do something else of interest.
-
- \section{References}
-
- See some of the papers on the Knowbot Programming Environment on
- CNRI's publications page: ``Knowbot programming: System support for
- mobile agents'', at
- \url{http://www.cnri.reston.va.us/home/koe/papers/iwooos-full.html}, and ``Using
- the Knowbot Operating Environment in a Wide-Area Network'', at
- \url{http://www.cnri.reston.va.us/home/koe/papers/mos.html}.
-
- For information on Java's security model, consult the Java Security
- FAQ at \url{http://java.sun.com/sfaq/index.html}.
-
- Perl supports similar features, via a software package called Penguin
- developed by Felix Gallo.
- Humberto Ortiz Zuazaga wrote a paper called "The Penguin Model for
- Secure Distributed Internet Scripting", at
- \url{http://www.hpcf.upr.edu/~humberto/documents/penguin-safe-scripting.html}.
- Thanks to Fred Drake for bringing it to my attention.
-
- Work has also been done on Safe-Tcl; see ``The Safe-Tcl Security
- Model'', by Jacob Y. Levy, Laurent Demailly, John K. Ousterhout, and
- Brent B. Welch, in the Proceedings of the 1998 USENIX Annual Technical
- Conference. Usenix members can access the paper online at
- \url{http://www.usenix.org/publications/library/proceedings/usenix98/levy.html}.
-
-
- The Janus project provides a secure environment for untrusted helper
- applications by trapping unsafe system calls. The project page is
- \url{http://www.cs.berkeley.edu/~daw/janus/}. Thanks to Paul Prescod
- for suggesting it.
-
- Can you suggest other links, or some academic references, for this section?
\section{Version History}
! Sep. 12, 1998: Minor revisions and added the reference to the Janus project.
Feb. 26, 1998: First version. Suggestions are welcome.
--- 14,32 ----
\begin{abstract}
\noindent
! Python provides a \module{rexec} module for running untrusted code.
! However, it's never been exhaustively audited for security and it
! hasn't been updated to take into account recent changes to Python such
! as new-style classes. Therefore, the
! \module{rexec} module should not be trusted. To discourage use of
! \module{rexec}, this HOWTO has been withdrawn.
\end{abstract}
\section{Version History}
! Sep. 12, 1998: Minor revisions and added the reference to the Janus
! project.
Feb. 26, 1998: First version. Suggestions are welcome.
***************
*** 564,567 ****
--- 37,42 ----
Oct. 4, 2000: Checked with Python 2.0. Minor rewrites and fixes made.
Version number increased to 2.0.
+
+ Dec. 17, 2002: Withdrawn.
\end{document}
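
For reference, the denial-of-service answer removed in this revision describes forking a child process that limits itself with the resource module while the parent enforces a timeout. The following is only a minimal sketch of that pattern, assuming present-day Python 3 on a Unix-like system rather than the Python 2 of the HOWTO. The names run_limited and _cap, the parameters, and the result tuples are hypothetical and do not come from the HOWTO. The sketch does not use rexec, and it only bounds resource use; it is not a security sandbox.

```python
import os
import pickle
import resource
import select
import signal
import time


def _cap(kind, value):
    """Lower one resource limit, never asking for more than the hard limit."""
    _soft, hard = resource.getrlimit(kind)
    if hard != resource.RLIM_INFINITY:
        value = min(value, hard)
    resource.setrlimit(kind, (value, value))


def run_limited(func, cpu_seconds=2, memory_bytes=1 << 30, timeout=5.0):
    """Run func() in a forked child under CPU, memory and wall-clock limits.

    Returns ("ok", value), ("error", exception_name) or ("killed", reason).
    All names and return shapes here are illustrative assumptions.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: restrict itself, run the code, send back a pickled result.
        os.close(read_fd)
        try:
            try:
                _cap(resource.RLIMIT_CPU, cpu_seconds)
                _cap(resource.RLIMIT_AS, memory_bytes)
                data = pickle.dumps(("ok", func()))
            except BaseException as exc:
                data = pickle.dumps(("error", type(exc).__name__))
            sent = 0
            while sent < len(data):
                sent += os.write(write_fd, data[sent:])
        finally:
            os._exit(0)  # never fall back into the parent's code

    # Parent: read until end-of-file or until the deadline passes.
    os.close(write_fd)
    deadline = time.monotonic() + timeout
    chunks = []
    timed_out = False
    with os.fdopen(read_fd, "rb", buffering=0) as src:
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0 or not select.select([src], [], [], remaining)[0]:
                timed_out = True
                os.kill(pid, signal.SIGKILL)  # child took too long
                break
            chunk = src.read(65536)
            if not chunk:  # child closed the pipe (finished or died)
                break
            chunks.append(chunk)
    os.waitpid(pid, 0)  # reap the child so no zombie is left behind
    if timed_out:
        return ("killed", "timeout")
    data = b"".join(chunks)
    if not data:
        return ("killed", "no result")
    return pickle.loads(data)
```

A call such as run_limited(some_function, timeout=1.0) then returns a small tuple describing what happened. A child that exhausts its CPU limit is terminated by the operating system and shows up in the parent as ("killed", "no result").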