Re: [mod-security-users] mod_security causing Apache 1.3.33 to hang
From: Tom A. <tan...@oa...> - 2006-01-11 23:05:24
Ivan Ristic wrote:
> Windows because of the smaller stack size. If I recall
> correctly PCRE uses recursion for subexpressions internally,
> which leads to stack space consumption when the regex
> is applied to a long string.

For performance reasons, all regular expressions should be simplified as much as possible. Under the wrong circumstances, they can end up using lots of resources.

For instance, expressions should be non-greedy whenever possible. The expression /<.+>/ will match "<head>", but given "<head> blah blah blah blah blah ..." the ".+" first consumes everything up to the end of the string and then has to backtrack to find the last ">". It will also match all of "<head><title>HTML Injection Attack</title></head>" even though it would be sufficient to stop at "<head>" if you're just trying to reject HTML tags of any kind. So a more efficient version, which stops at the first ">" and avoids most of that backtracking, would be the non-greedy one, /<.+?>/.

But still, any filter that looks for one or two characters followed by ".+" or even ".+?" is likely to be a resource hog on input that starts to match and then fails, because the engine scans to the end of the string before giving up. To cut down on this, try to add as much detail to an expression as possible. Using character classes to reduce the set of characters that will match can both cut down on false positives and significantly reduce the recursion on each string.

For instance, if an HTML tag cannot start with a number, then using the expression /<\s*[^\d\s].+?>/ will let the regex engine reject a string such as "if x < 5, then z = 0 blah blah blah...." right at the "<" instead of searching all the way to the end of the string. We've added more detail before the ".+?" part. (The class has to exclude whitespace as well as digits: with just [^\d], the engine can backtrack and let the class match the space after the "<".)

This might be a bad example since most HTML engines will just ignore a number at the beginning of a tag, but then again, an HTML tag, being an enclosure of a string of just about any size, is just too variable to identify and flag efficiently with a filter directive anyway.
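To make the difference between /<.+>/ and /<.+?>/ concrete, here is a minimal sketch using Python's re module rather than the PCRE build inside Apache. It assumes the two engines behave the same way for these simple patterns. The helper name first_match and the sample strings are mine, purely for illustration.

```python
import re

# Sample inputs taken from the examples discussed above.
HTML = "<head><title>HTML Injection Attack</title></head>"
PROSE = "if x < 5, then z = 0 blah blah blah"


def first_match(pattern, text):
    """Return the first substring matched by pattern, or None if nothing matches."""
    found = re.search(pattern, text)
    return found.group(0) if found else None


# ".+" takes as much as it can, so the match runs to the last ">" in the string.
greedy = first_match(r"<.+>", HTML)

# ".+?" takes as little as it can, so the match ends at the first ">".
lazy = first_match(r"<.+?>", HTML)

# There is no ">" in the prose sample, so the pattern cannot match at all.
prose_hit = first_match(r"<.+?>", PROSE)

print(greedy)
print(lazy)
print(prose_hit)
```

This only shows what each pattern matches, not how much work the engine does to get there. Measuring that would need timing against long inputs on the actual PCRE build.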
Better instead would be to sanitize your input so that HTML tags are made impossible by escaping the tag symbols themselves. But you can't just do this for every input ever passed into Apache, as some inputs shouldn't be mutilated in this way if they're never going to be displayed on a web page. Ideally, the script that handles this input should do its own sanitizing.

I'm not sure if you can use mod_security to do this, but maybe you can try something like:

    SecFilterSelective THE_REQUEST "vulnerable-script-name" chain
    SecFilterSelective ARG_SANITIZEME "(<|>)" "exec:html_escape.pl"

But I don't think the exec'd script gets passed the info or inserts anything back into the string. Ideally "html_escape.pl" would be passed the "ARG_SANITIZEME" content on STDIN and then mod_security would replace "ARG_SANITIZEME" with the output of "html_escape.pl". That would be a true external filter, similar to how procmail works.

Ivan, correct me if I'm wrong that mod_security can't currently do what I'm suggesting. Actually, ideally you could do this:

    SecFilterSelective THE_REQUEST "vulnerable-script-name" chain
    SecFilterSelective ARG_SANITIZEME s/</&lt;/
    SecFilterSelective ARG_SANITIZEME s/>/&gt;/

But that too wouldn't work in mod_security, I believe. Is this something that could be added in future versions? Or maybe even a new directive specifically for HTML-escaping input? Something like:

    SecFilterSelective THE_REQUEST "vulnerable-script-name" chain
    SecFilterHTMLEscape ARG_SANITIZEME

I think it would be extremely useful to be able to modify request content in this way rather than just flagging it.

Tom
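For the "script does its own sanitizing" route, a rough sketch of what the escaping step could look like is below, written in Python rather than Perl. This runs inside the application, not inside mod_security. The function name sanitize_args and the argument names are hypothetical, and it uses the standard library's html.escape, which also escapes "&" and quote characters in addition to "<" and ">".

```python
import html


def sanitize_args(args, names):
    """Return a copy of args with the values of the named arguments HTML-escaped.

    Arguments not listed in names are passed through unchanged, since not
    every input is destined for display on a web page.
    """
    return {
        key: html.escape(value) if key in names else value
        for key, value in args.items()
    }


# Hypothetical request arguments; only SANITIZEME is meant for display.
request_args = {"SANITIZEME": "<script>alert(1)</script>", "id": "42"}
clean = sanitize_args(request_args, {"SANITIZEME"})

print(clean["SANITIZEME"])
```

Escaping only the arguments that will be displayed avoids the problem mentioned above of mangling input that never reaches a web page.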