From: <duv...@ya...> - 2005-08-08 08:55:10
Dear,

I wanted to match HTML/XML comments with a regular expression in order to remove them. The following expression:

    regex = re.compile(r'(<!--(\s*\S+\s*)-->)+', re.I)

works, but fails on HTML/XHTML files of 150+ lines with:

    RuntimeError: maximum recursion limit exceeded

Somebody (Armin Ehrenfels) proposed replacing the regex with this one:

    regex = re.compile(r'(<!--(?:[ ]|\s*(?!-->)(?:\S(?:.(?!-->))*.))-->)', re.M|re.S)

which works fine in CPython (even on an 11 MB file with more than 2000 matches) but still fails when running on Jython (2 matches spread over 150 lines).

Is there a way to allow more recursion in Jython? Our test server has 8 GB of RAM, so there might be a way to push it through?

Tx,
\T

Any fool can write code that a computer can understand. Good programmers write code that humans can understand.
Martin Fowler

T. : +32 (0)2 742 05 94
M. : +32 (0)497 44 68 12
From: Diez B. R. <de...@we...> - 2005-08-08 09:13:24
duv...@ya... wrote:
> Dear,
>
> I wanted to match the HTML / XML comments with a regular expression to be able to remove them.
> The following expression :
> regex = re.compile(r'(<!--(\s*\S+\s*)-->)+', re.I)
> Is okay but fails on HTML/XHTML files of 150+ lines with a :
> RuntimeError: maximum recursion limit exceeded
>
> Somebody (Armin Ehrenfels) proposed to replace the regex by this one :
> regex = re.compile(r'(<!--(?:[ ]|\s*(?!-->)(?:\S(?:.(?!-->))*.))-->)', re.M|re.S)
> which works fine in CPython (even on a 11Mb file with more than 2000 matches) but still fails when
> running on Jython (2 matches spread over 150 lines).
> Is there a way to allow more recursion with Jython ?
> Our test server has 8 Gb of RAM ... there might be a way to push it thru' ... ?

Maybe by tuning the VM's thread stack size with -Xss. Or, even better, use a parser.

Diez
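
To illustrate the "use a parser" suggestion, here is a minimal sketch of a hand-written scanner that removes comments with plain string searches, so the regex engine's recursion limit never comes into play. The function name strip_comments is hypothetical and not from the thread, and the sketch assumes that a comment is simply everything from "<!--" up to the next "-->". It does not treat comment-like text inside CDATA sections or script blocks specially.

```python
def strip_comments(text):
    """Remove <!-- ... --> comments using a linear scan (no regex)."""
    parts = []
    pos = 0
    while True:
        start = text.find("<!--", pos)
        if start == -1:
            # No more comments: keep the rest of the text.
            parts.append(text[pos:])
            break
        # Keep the text that precedes the comment.
        parts.append(text[pos:start])
        end = text.find("-->", start + 4)
        if end == -1:
            # Unterminated comment: leave the remainder untouched.
            parts.append(text[start:])
            break
        # Resume scanning right after the closing "-->".
        pos = end + 3
    return "".join(parts)
```

Because it relies only on str.find and slicing, the work grows with the length of the input rather than with nesting inside a pattern, which is why a multi-line comment costs no extra stack depth here.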