Re: [Mod-security-developers] Fwd: Performance improvement plan
Brought to you by:
victorhora,
zimmerletw
|
From: Brian R. <bre...@gm...> - 2009-08-25 21:28:35
|
On Tue, Aug 25, 2009 at 1:38 PM, Ivan Ristic<iva...@gm...> wrote: > Forwarding message, as the original did not show in the archives. > > Is it getting through? I do not see any since Jan 2009. https://sourceforge.net/mailarchive/forum.php?forum_name=mod-security-developers > > > ---------- Forwarded message ---------- > From: Ivan Ristic <iva...@gm...> > Date: Fri, Aug 21, 2009 at 5:54 PM > Subject: Performance improvement plan > To: mod...@li... > > > I am planning to improve the performance of the ModSecurity rule > engine. I've reviewed the source code and come up with the following > ideas: > > 1. Avoid table lookup currently used to determine if a rule is > multiMatch. I don't suppose this is going to result with a significant > improvement, but it's been getting on my nerves for years now and I > want to get rid of it. I plan to add a variable to the rule structure > and set it at configuration time. Before I do anything, however, I > will investigate how SecRuleUpdateActionById interacts with action > lists. Agreed. > > 2. Change the individual rule processing code to avoid discovering > tfns at runtime. At the moment we are re-discovering tfns for every > variable, even though the action list never changes. Resolving the > tfns at configuration time should result a significant performance > improvement. You will need to re-build the list from SecRuleUpdateActionById at config time, which should be fine. Just a stored list of tfn structs would be fine instead of building the list at runtime. It should have been like that to begin with ;) The whole runtime "rule" structure needs to be direct-access instead of lookups, so you might as well remove all the lookups. > > 3. Cache entire target lists. Currently a lot of time is spent > resolving complex variables, adding and removing variables from the > target list, and transforming them in the end. I want to identify the > rules that share target lists, cache a reusable target list when it is > created for the first time, only to reuse it later. The cost of each > rule in the engine should go down a lot. In the worst case scenario we > waste a hash lookup (per rule). This should be countered by the > removal of the multiMatch lookup ;) Only issue I see here is with non-cacheable targets and targets that are listed in different orders. We need a good way to "hash" the target list so that any rule that uses the same targets (at least the same cacheable targets and in any order) resolves to the same hashed value. I had thought about using the ptr value of the tfn structure as this is unique per tfn and it does not have to survive a restart. So, you have to pull the cached values, then merge in and transform any of the non-cacheable targets. i would prefer if target order is maintained, which gives the rule writer some flexibility for optimizing the most likely targets first., but I am not sure that is a big deal. > > I have a couple of other smaller things I may look into, but I suspect > the improvements from the above 3 changes will make me happy enough. One other *big* improvement that is needed is how SecAction/SecMarker work as well as skipping. This was meant to just skip a few rules, but it is now being used to skip entire sections of rules and this is *very* slow. We need a way to do a real skipAfter without nop'ing each rule in between. This has to also handle rule removal/updating at config time *and* runtime, which is where it gets tricky. Originally I was thinking a linked list with a pointer to the "next rule" to run on a match success/failure. But the accounting is tricky with ctl:ruleRemovebyId. > > The plan is to benchmark with the available rule sets. > Sounds great. It may also show some need for optimization in the rulesets as well. Any plan for how you will be doing the benchmark? > The only question is what to branch and when? I am hoping for the > improvements to be significant enough to warrant a middle digit bump > and I'd like to see them go into 2.6.0. The trunk is 2.6 (will be). Work there as nothing but 2.5.x merges are getting commited there currently. thanks, -B |