Thread: [Pyparsing] Question/help with pyparsing
From: Vineet J. (gmail) <vin...@gm...> - 2007-11-09 21:52:19
Problem: I allow the users of my application to write small rules as Python code, and I use pylint to find errors in the code they enter. As part of the user code they are required to define a list (lista) and a couple of dicts (dict1, dict2) at the module level. I use lista, dict1, and dict2 to add variables to the module dynamically at run time.

The problem I'm having is that pylint complains about the dynamic variables that are set in lista, dict1, and dict2. So I was thinking of a multi-stage effort to find syntax errors in the user's Python code:

Step 1: Extract lista, dict1, and dict2 with some pyparsing code.
Step 2: Convert lista, dict1, and dict2 to valid Python objects using the pyparsing JSON conversion function.
Step 3: Run pylint on the user's Python code, ignoring errors for variables and functions defined in lista, dict1, and dict2.

Example user Python code:

    Lista = ["variable1", "variable2"]
    RuleDict = {
        'rule1':{'name1':'function1name', },
        'rule1':{'name1':'function1name', }
    }

    def user_logic():
        print variable1       #Not error
        print variable2       #Not error
        print function1name   #Not error
        print function1name   #Not error
        asdfasjkdfsdkajflasj; #ERROR
        asfasdfasdf           #ERROR

How would I do step 1 and step 2 with pyparsing? I'm going to sign up for the O'Reilly book and do some more reading on pyparsing this weekend, but any help that I can get from anyone on this list would be much appreciated.

Thanks,
Vineet
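For steps 1 and 2, one possible alternative to a pyparsing grammar is Python's standard `ast` module. This is a swapped-in technique, not something the thread settled on. The sketch below assumes Python 3; `extract_literals` and the sample source are hypothetical names made up for illustration, and the sample uses two distinct rule keys rather than the duplicated 'rule1' from the post.

```python
import ast

def extract_literals(source, wanted):
    """Return {name: value} for module-level `name = <literal>` assignments.

    Hypothetical helper: only names listed in `wanted` are collected.
    """
    found = {}
    tree = ast.parse(source)      # raises SyntaxError if the code is invalid
    for node in tree.body:        # module level only, not nested scopes
        if not isinstance(node, ast.Assign):
            continue
        for target in node.targets:
            if isinstance(target, ast.Name) and target.id in wanted:
                # literal_eval accepts only literals (lists, dicts, strings,
                # numbers, ...) and raises ValueError for anything else.
                found[target.id] = ast.literal_eval(node.value)
    return found

# Sample user code, adapted from the post to Python 3 syntax.
USER_CODE = '''
Lista = ["variable1", "variable2"]
RuleDict = {
    'rule1': {'name1': 'function1name'},
    'rule2': {'name1': 'function2name'},
}

def user_logic():
    print(variable1)
'''

declared = extract_literals(USER_CODE, {"Lista", "RuleDict"})
```

The names collected this way could then feed whatever mechanism is used in step 3 to tell pylint which variables to ignore.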
From: Ralph C. <ra...@in...> - 2007-11-10 10:58:01
Hi Vineet,

> I allow the users of my application to write small rules as python
> code. I use pylint to find errors in code they enter. As part of the
> user code they are required to enter a list (lista) and a couple of
> dicts (dict1, dict2) at the module level.

I guess dict1's values can also be dicts, lists, etc.

> I use lista, dict1, dict2 to add variables to the module dynamically
> at run time. The problem I'm having is that pylint complains of the
> dynamic variables that are set in lista and dict1 and dict2. So I was
> thinking of having a multiple stage effort to find syntax errors with
> the user python code.

Have you considered using codeop.py to attempt to compile their Python list and dict code? If you're wary of them doing unwanted stuff in the code they provide, then maybe opcode.py can be used to scan through the compiled bytecode to check they're just doing simple dict and list construction?

Cheers,
Ralph.
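A minimal sketch of the codeop idea, assuming Python 3 and that the list or dict text has already been isolated as a single expression. The helper name `is_valid_literal_source` and the filename string are hypothetical.

```python
import codeop

def is_valid_literal_source(source):
    """Return True if `source` compiles as one complete Python expression."""
    try:
        # symbol="eval" asks for an expression rather than statements.
        code = codeop.compile_command(source, "<user-config>", "eval")
    except (SyntaxError, ValueError, OverflowError):
        return False
    # compile_command returns None when the input is merely incomplete,
    # for example an unclosed bracket.
    return code is not None
```

Note that this only checks syntax. An expression such as a function call also compiles cleanly, so restricting what the code may do needs a separate inspection step.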
From: Vineet J. (gmail) <vin...@gm...> - 2007-11-10 13:32:50
Quick update. I ended up using:

http://www.fauskes.net/nb/parsing-simulink/

to set up the configuration for my user models. The only problem was that names were not allowed to have underscores in them. I tried to fix that by changing:

    mdlName = Word('$'+'.'+alphas+nums)

to

    mdlName = Word('$'+'.'+alphas+nums+'_')

and it worked. Cool.

The above grammar spec requires that all config parameters are enclosed in one master System {}; anything outside this is ignored. For example:

    System {
        Config1 {
            Name test
        }
        Config2 {
            Name test
        }
    }

works. However,

    Config1 {
        Name test
    }
    Config2 {
        Name test
    }

only captures Config1 and ignores Config2. Any suggestions on how to make it work without everything having to be enclosed in {}?
From: Vineet J. (gmail) <vin...@gm...> - 2007-11-10 13:51:42
>> Have you considered using codeop.py to attempt to compile their Python list and dict code?

That's a good idea. My current plan is to use part of the following recipe:

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/496746

with the following restrictions:

    unallowed_ast_nodes = [
        'Backquote', 'Exec', 'From', 'Global',
        'GenExpr', 'GenExprFor', 'GenExprIf', 'GenExprInner', 'Getattr',
        'Import', 'Power', 'TryExcept', 'TryFinally', 'Yield'
    ]

    # Deny evaluation of code if it tries to access any of the following builtins:
    unallowed_builtins = [
        '__import__', 'chr', 'apply', 'basestring', 'buffer', 'callable',
        'chr', 'classmethod', 'coerce', 'compile', 'complex', 'delattr',
        'dir', 'divmod', 'eval', 'execfile', 'file', 'filter', 'frozenset',
        'getattr', 'globals', 'hasattr', 'hex', 'id', 'input', 'intern',
        'isinstance', 'issubclass', 'locals', 'map', 'object', 'oct', 'open',
        'ord', 'pow', 'property', 'range', 'raw_input', 'reduce', 'reload',
        'repr', 'reversed', 'round', 'set', 'setattr', 'staticmethod',
        'super', 'type', 'unichr', 'unicode', 'vars', 'zip'
    ]

I will also check for use of * and ** with pyparsing. I will replace both of these with my own wrappers to make sure that there are no cases like:

    20000**11111111111111111111111111111111111
    [1]*11111111111111111111111111111111111

etc. I think given this, I should be able to run untrusted code.

Thanks,
Vineet
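The wrappers around * and ** mentioned above might look roughly like the following sketch (Python 3). The names `safe_pow` and `safe_mul` and all three limits are hypothetical choices for illustration, not taken from the recipe.

```python
MAX_RESULT_BITS = 4096      # hypothetical cap on the size of an integer power
MAX_SEQUENCE_ITEMS = 10000  # hypothetical cap on a repeated sequence's length

def safe_pow(base, exponent):
    """Stand-in for `base ** exponent` that refuses huge integer results."""
    if isinstance(base, int) and isinstance(exponent, int) and exponent > 0:
        # base < 2**bits, so base**exponent < 2**(bits * exponent):
        # the product is an upper bound on the result's size in bits.
        if abs(base).bit_length() * exponent > MAX_RESULT_BITS:
            raise ValueError("power result would be too large")
    return base ** exponent

def safe_mul(left, right):
    """Stand-in for `left * right` that refuses huge sequence repetition."""
    for seq, count in ((left, right), (right, left)):
        if isinstance(seq, (list, tuple, str, bytes)) and isinstance(count, int):
            if len(seq) * count > MAX_SEQUENCE_ITEMS:
                raise ValueError("repeated sequence would be too long")
    return left * right
```

Both checks run before the real operation, so the two expressions from the post are rejected without allocating anything large.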
From: Ralph C. <ra...@in...> - 2007-11-11 10:51:26
Hi Vineet,

> http://www.fauskes.net/nb/parsing-simulink/
> The above grammar spec requires that all config parameters are
> enclosed in one master System {} anything outside this is ignored.

I think it doesn't mandate "System" but any mdlName.

> Config1 {
> Name test
> }
>
> Config2 {
> Name test
> }
>
> Only captures Config1 and ignores config2. Any suggestions on how to make it
> work without everything having to be enclosed in {}

Try changing

    mdlparser = mdlObject

to

    mdlparser = OneOrMore(mdlObject)

Cheers,
Ralph.
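To illustrate the difference, here is a small sketch with a simplified stand-in grammar. It is not the grammar from the fauskes.net article; `name`, `value`, `param`, and `block` are hypothetical names, and the third-party pyparsing package is assumed to be installed.

```python
from pyparsing import (Forward, Group, OneOrMore, Suppress, Word,
                       ZeroOrMore, alphas, nums)

name = Word("$." + alphas + nums + "_")
value = Word(alphas + nums + "_.")
param = Group(name + value)                  # e.g. "Name test"

block = Forward()                            # blocks may nest
block <<= Group(name + Suppress("{")
                + Group(ZeroOrMore(block | param))
                + Suppress("}"))

single_parser = block                        # like mdlparser = mdlObject
multi_parser = OneOrMore(block)              # like OneOrMore(mdlObject)

TEXT = """
Config1 {
    Name test
}

Config2 {
    Name test
}
"""

first_only = single_parser.parseString(TEXT)   # stops after Config1
both = multi_parser.parseString(TEXT)          # consumes Config1 and Config2
```

By default parseString stops quietly once the grammar is satisfied, which is why the single-object parser silently ignores Config2. Passing parseAll=True would instead raise an error on the leftover input.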
From: Ralph C. <ra...@in...> - 2007-11-11 10:52:57
Hi Vineet,

> That's a good idea. My current plan is to use part of the following
> recipe:
>
> http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/496746

It might be better from a security point of view to list what's allowed, having carefully considered the implications of each opcode, rather than list what's disallowed. If you miss an opcode by accident, the former will cramp what the user can express; the latter will let them escape your constraints.

Cheers,
Ralph.
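An allow-list along these lines can be sketched with Python 3's `ast` module. Note that this checks AST node types rather than the opcodes mentioned above, so it is an analogous technique and not the same one. `ALLOWED_NODES` and `check_allowed` are hypothetical names, and the chosen set is deliberately tiny.

```python
import ast

# Everything not listed here is rejected, including node types
# that nobody thought about when writing the list.
ALLOWED_NODES = (
    ast.Expression,  # wrapper node produced by mode="eval"
    ast.Dict, ast.List, ast.Tuple,
    ast.Constant,    # strings, numbers, True/False/None
    ast.Load,        # expression context attached to List and Tuple
)

def check_allowed(source):
    """Parse `source` as an expression and reject any unlisted node type."""
    tree = ast.parse(source, mode="eval")
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            raise ValueError("disallowed syntax: " + type(node).__name__)
    return tree
```

With this set, a nested dict or list of constants passes, while a call, a name lookup, or an operator such as ** is refused. Even a negative number like -1 is rejected, because it parses as a unary operation. That is the "cramping" trade-off described above: a forgotten entry limits the user rather than opening a hole.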