Re: [mod-security-users] Performance woes - larger JSON payloads with CRS

Hey Michael,

You are correct. PCRE can be awfully slow. Alternative regex engines have less
functionality, but linear performance. That's why we are working hard to
drop the PCRE-specific functionality from CRS to make the rules compatible
with those engines.
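
To illustrate the backtracking problem, here is a minimal sketch in Python,
whose built-in re module is also a backtracking engine. The pattern (a+)+$ and
the helper name time_match are assumptions made up for this illustration; they
are not taken from CRS, and rule 941160 is far more complex. For this pattern,
each extra input character roughly doubles the work before the engine can
report a failed match.

```python
import re
import time

# Hypothetical pathological pattern (not a CRS rule): nested quantifiers
# force a backtracking engine to try exponentially many ways to split the
# run of "a" characters before it can report a failed match.
PATTERN = re.compile(r"(a+)+$")


def time_match(n: int) -> float:
    """Return the seconds needed to (fail to) match n 'a's followed by 'b'."""
    subject = "a" * n + "b"  # the trailing "b" guarantees the match fails
    start = time.perf_counter()
    result = PATTERN.search(subject)
    elapsed = time.perf_counter() - start
    assert result is None
    return elapsed


if __name__ == "__main__":
    # Keep n small: the runtime grows very quickly with every step.
    for n in (12, 14, 16, 18, 20):
        print(f"n={n:2d}  {time_match(n):.4f}s")
```

An engine with linear-time guarantees would not show this growth on the same
input, at the cost of not supporting features such as backreferences.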

Meanwhile, the ModSec developers are also attempting to bring other engines
to ModSec, at least to ModSec3 AFAIK.

When doing this kind of research, it may be worth checking out msc_retest by
CRS dev Ervin Hegedüs / @airween:
https://coreruleset.org/20210106/introducing-msc_retest/

Best,

Christian

On Mon, Apr 26, 2021 at 09:25:19AM +0000, Michael Woods via mod-security-users wrote:
> I've had a similar experience with mod_security and CRS. Some of our JSON payloads can be 4MB. Given that the regular expressions executed by mod_security are Perl based, I extracted the regular expressions from the CRS and benchmarked them individually with varying sizes of JSON documents.
> 
> All CRS rules completed within 0.03s except for one. Rule 941160, when executed against large payloads, can take far longer to complete. A 1MB payload takes around 2700 seconds to complete without PCRE JIT enabled. Running PCRE with JIT compilation does have an impact, but it still takes around 385 seconds to complete.
> Rule 941160 belongs to a collection of rules concerning XSS attacks.
> Doing some more digging, I discovered that backtracking regular expression engines such as Perl's and PCRE can have exponential response times as the data grows in size. I was also doing some work with Google's Go language, and its regular expression engine, re, provides a linear response time. Luckily, Perl's CPAN repository has a module that provides an interface to re. Running the same benchmarks again, rule 941160 took less than a second. I had to run it again and check the results because I didn't believe it.
> Regards
> 
>     On Monday, 26 April 2021, 07:16:18 BST, Christian Folini <chr...@ne...> wrote:  
>  
>  Hey Henri,
> 
> From a security practice standpoint, this is obviously lacking, but from a
> wider perspective, I see it meeting the "industry standard", yes.
> 
> When I teach, I tell my students that the worst WAF is the one that is
> switched off. So if you need to compromise and you can only apply 20% of
> the rules because you run the risk of the business demanding it be switched
> off, then that 20% WAF is still better than no WAF.
> 
> Cheers,
> 
> Christian
> 
> 
> On Mon, Apr 26, 2021 at 06:57:54AM +0100, Henri Cook wrote:
> > Thanks Christian. Taking this in combination with Osama's point earlier in
> > the thread that most 'big four' (AWS/GCP/Cloudflare/Azure) WAFs seem to
> > limit the payload they'll scan (from my reading, to 128kb for
> > Cloudflare/Azure or 8kb for AWS/GCP), I think I'll be able to resolve our
> > particular issue.
> > 
> > I believe my modern application is already very robust in terms of defence
> > against sql injection as well as other OWASP top 10 attack vectors and that
> > a WAF primarily adds a layer of reassurance (for the business and clients who
> > ask if I have one) and minor frustration (for any potential attacker). The
> > spec is to add a WAF that meets (but notably does not necessarily have to
> > exceed) industry standards. I believe this means that I can switch modsec
> > to 128kb or 8kb partial parsing ('SecRequestBodyLimitAction
> > ProcessPartial' - letting anything beyond those sizes through unscanned)
> > and be able to say I've got scan-size-policy-parity with AWS or
> > Cloudflare, which means it is "industry standard".
> > 
> > Please let me know if you think that's mad and thanks again
> > 
> > Best Regards,
> > 
> > Henri
> > 
> > On Sun, 25 Apr 2021 at 21:39, Christian Folini <chr...@ne...>
> > wrote:
> > 
> > > Hey Henri,
> > >
> > > You are in a bad situation and as far as I can see you are right, you might
> > > have to drop modsec/CRS in this situation.
> > >
> > > I've had a customer with a similar problem and we did a deep dive
> > > investigation and I had to strike colors in the end.
> > >
> > > The point is not the JSON parser. That has been shown to be really fast.
> > > The point is the several hundred variables that go into CRS afterwards.
> > > If you run CRS on a standard web application you get forms with a few
> > > parameters and that's easy. But several megabytes of JSON means hundreds
> > > of arguments and CRS parses them all.
> > >
> > > So we tried to work with rule exclusions and skip the parameters we did not
> > > think dangerous, but here comes the bummer: ModSec 2.9 grew substantially
> > > slower the longer the ignore-lists of parameters became. This and a few
> > > very odd behaviors.
> > >
> > > Given the customer wanted a generic WAF without tuning of individual APIs,
> > > we got to a dead end.
> > >
> > > However, if tuning was an option, then I would probably edit CRS with
> > > msc_pyparser and replace the target lists with the arguments I was
> > > interested in.
> > >
> > > https://coreruleset.org/20200901/introducing-msc_pyparser/
> > >
> > > As a complementary practice, one could think of performing allowlist
> > > checks on some / most of the JSON. Say you have a huge JSON payload with
> > > 500 parameters. You examine it and discover that 300 of them actually
> > > contain simple digits and ASCII characters and neither special chars nor
> > > escape sequences.
> > > So you do a regex allowlist and apply it to these 300 parameters of said
> > > API. And the rest you can push into CRS. Or a subset of CRS.
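
A minimal sketch of that allowlist idea in Python, outside of ModSecurity. The
names flatten, split_by_allowlist and SAFE_VALUE, as well as the character
class itself, are assumptions for illustration only; a real deployment would
express this as rules per API. The sketch flattens the JSON into (path, value)
leaves and separates the values that match the allowlist from those that would
still have to go through CRS. Keys are not checked here.

```python
import json
import re
from typing import Any, Dict, Iterator, Tuple

# Assumed allowlist: ASCII letters, digits, and a few harmless separators only.
SAFE_VALUE = re.compile(r"[A-Za-z0-9 _.\-]*")


def flatten(node: Any, path: str = "json") -> Iterator[Tuple[str, str]]:
    """Yield (path, value-as-text) for every leaf of a parsed JSON document."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from flatten(child, f"{path}.{key}")
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from flatten(child, f"{path}.{index}")
    else:
        yield path, "" if node is None else str(node)


def split_by_allowlist(payload: str) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Return (allowlisted, needs_scanning) leaves of a JSON payload."""
    allowlisted: Dict[str, str] = {}
    needs_scanning: Dict[str, str] = {}
    for path, value in flatten(json.loads(payload)):
        # fullmatch: the whole value must consist of allowlisted characters.
        target = allowlisted if SAFE_VALUE.fullmatch(value) else needs_scanning
        target[path] = value
    return allowlisted, needs_scanning
```

For the payload {"id": 42, "name": "Alice", "note": "<script>"}, the first
dictionary holds json.id and json.name, and only json.note is left for the
expensive rules.
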
> > >
> > > I have not done this, and the question is whether ModSec is able to handle
> > > the large target lists in a speedy manner.
> > >
> > >
> > > Now you can turn to a CDN or alternative WAF. I would do extensive security
> > > tests of such a system. As I said, the JSON parser can be really fast. The
> > > difficult thing is to check several hundred parameters without losing
> > > performance.
> > >
> > > Good luck!
> > >
> > > Christian
> > >
> > >
> > > On Sun, Apr 25, 2021 at 08:47:06PM +0100, Henri Cook wrote:
> > > > Hi all,
> > > >
> > > > I'm in a situation where the only solution seems to be to drop modsec/CRS
> > > > and look at something like Cloudflare's WAF (and change our security model
> > > > out of necessity). I'm hoping the esteemed membership of this list might
> > > > have some thoughts.
> > > >
> > > > I've got about 1MB of JSON; payloads in our app might run to 20 or even
> > > > 30MB ultimately.
> > > > This 1MB of somewhat nested JSON (7 or 8 levels deep) can take 40 seconds
> > > > to process in mod sec 3.0.4 with CRS 3.2.0.
> > > >
> > > > It takes 1 second to process in our API so the WAF element is a 39x slow
> > > > down. I appreciate there'll be some delays in WAF. Cloudflare's WAF takes 5
> > > > seconds to scan this payload - and that's my target.
> > > >
> > > > Has anyone got any idea how to improve performance? Reading blog posts
> > > > about the development of cloudflare's waf I see that memoization of common
> > > > function calls was one of their absolute best performance improvements over
> > > > their modsec implementation (e.g. strlen(response_body) so it's only
> > > > calculated once instead of once per rule OR contains('somestring',
> > > > response_body)... you get the drift). Do we have anything like this in
> > > > modsec today? Is that already in place and my 39 seconds is after that?
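
The memoization idea can be sketched in Python with functools.lru_cache. This
is not how ModSecurity or Cloudflare's WAF is implemented; the rule list, the
contains helper and the call counter below are hypothetical and only show that
a check repeated by several rules on the same body is evaluated once.

```python
from functools import lru_cache

calls = {"contains": 0}  # counts real (uncached) evaluations


@lru_cache(maxsize=None)
def contains(needle: str, body: str) -> bool:
    """Substring check, evaluated once per distinct (needle, body) pair."""
    calls["contains"] += 1
    return needle in body


# Hypothetical rules: several of them repeat the same check on the body.
RULES = [
    ("rule-1", "<script"),
    ("rule-2", "union select"),
    ("rule-3", "<script"),
    ("rule-4", "<script"),
]


def matching_rules(body: str) -> list:
    """Return the ids of the rules whose needle occurs in the body."""
    return [rule_id for rule_id, needle in RULES if contains(needle, body)]
```

With these four rules, one body triggers only two real substring searches. An
unbounded cache keeps every body alive, so a real implementation would more
likely scope the cache to a single transaction.
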
> > > >
> > > > I appreciate that mod sec is fast on its own and adding complex rules can
> > > > be said to slow it down. With CRS being by far the most common use case for
> > > > mod sec (based on my googling) I'm surprised it's this slow, do you think
> > > > I've missed something?
> > > >
> > > > To note: I'm only scanning JSON payloads, typically much less than 0.5MB
> > > > but new, irregular ones that we need scanned in ideally <10 seconds that
> > > > can range from 1MB-30MB
> > > >
> > > > Best regards,
> > > >
> > > > Henri Cook
> > >
> > >
> > > > _______________________________________________
> > > > mod-security-users mailing list
> > > > mod...@li...
> > > > https://lists.sourceforge.net/lists/listinfo/mod-security-users
> > > > Commercial ModSecurity Rules and Support from Trustwave's SpiderLabs:
> > > > http://www.modsecurity.org/projects/commercial/rules/
> > > > http://www.modsecurity.org/projects/commercial/support/
> > >