Thread: [mod-security-users] Collection db file growing at alarming rate
From: Mark M. <mos...@gm...> - 2012-06-20 19:09:56
This is a "Just checking if this is normal" post. This may also be a "You're doing it wrong" post :)

I've got a collection on a fairly busy server, running apache 2.2.22 with modsec 2.6.3, Debian Squeeze 32-bit. I'm trying to track IP+URL for ratelimiting. I'm constructing the IP collection like this:

SecRule REQUEST_URI "^(.*)$" "phase:1,pass,nolog,t:none,t:md5,t:hexEncode,capture,id:99000"
SecRule TX:1 "^(.{8})" "phase:1,pass,nolog,t:none,capture,setvar:tx.requri_hash=%{TX.1},id:99001"

SecRule REMOTE_ADDR "@rx ^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})$" "capture,setvar:tx.fwdhost=%{TX.1},phase:1,id:99910,nolog,pass"
SecRule REMOTE_ADDR "!@rx ^10\.\d{1,3}\.\d{1,3}\.\d{1,3}$" "chain,initcol:IP=%{TX.fwdhost}:%{TX.requri_hash},phase:1,id:99911,nolog,pass"
SecRule REMOTE_ADDR "@rx ^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$"

SecRule REMOTE_ADDR "!@rx ^10\.\d{1,3}\.\d{1,3}\.\d{1,3}$" "chain,phase:1,setvar:IP.req_count_60s=+1,expirevar:IP.req_count_60s=60,setvar:IP.req_count_1800s=+1,expirevar:IP.req_count_1800s=1800,nolog,pass,id:99930"
SecRule REMOTE_ADDR "@rx ^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$"

(Leaving out the part where it checks against the collection and does a 'deny')

So the IP collection key is a combination of the IP (with the 10./8 range filtered out) plus the first 8 chars of the MD5 hash of the URL (saving the whole URL seemed like it could be way too big). I'm expiring on both a 60 second window and an 1800 second window. BTW, there's a good but unrelated reason why I'm doing @rx against REMOTE_ADDR; I realize it'd normally be unnecessary.

First off, the collection seems to be working OK. I wrote a little Perl script to dump the SDBM file, and it's collecting what looks right. My issue is that the ip.pag file grows insanely quickly. For example, in 9 minutes it had grown to 4 gig. In an earlier run (with the SDBM file deleted between all tests), it made it to 8 gig in about 15 minutes.
In the 4 gig file, the Perl script (which might of course be wrong itself) said there were 13359 entries. That's roughly 360k per entry, which seems a bit large. Is it storing a lot of unprintable data in each record, like a detailed record of every request? This is admittedly probably over-simplistic, but iterating over the tied hash from the SDBM file in Perl and adding up the record sizes as well as the sizes of the keys, I get a measly 4.7 meg. I'd expect a good deal of per-entry overhead in the db, but that's off by almost 1000x.

I saw in MODSEC-160 that 1800 is probably a bit high, so I can lower that (or just keep the 60s one). But if my file is hitting that size in 9 minutes, then 1800 seconds is probably irrelevant here. I had originally used just a 60 second expiration time, and even with that alone (i.e. no additional 1800 second expiration) the file was growing very large, very quickly; obviously not as rapidly as with the additional 1800s expiration, but quickly enough to seem unwarranted by the number of records in the SDBM file.

So:
a) Am I doing something insanely stupid here? That'd be my preference. This might also include a poor conceptual grasp of what goes into the SDBM file.
b) Am I abusing the IP collection, and if so, is there somewhere else I should be storing this?
c) Am I hitting a bug of some sort? The release notes for >=2.6.4 didn't mention any fixes relating to collections, so I haven't tried upgrading yet.
d) Any hints on how I could be doing this better are highly appreciated.

Relevant version info:

[Wed Jun 20 14:17:09 2012] [notice] ModSecurity for Apache/2.6.3 (http://www.modsecurity.org/) configured.
[Wed Jun 20 14:17:09 2012] [notice] ModSecurity: APR compiled version="1.4.5"; loaded version="1.4.5"
[Wed Jun 20 14:17:09 2012] [notice] ModSecurity: PCRE compiled version="7.6"; loaded version="8.02 2010-03-19"
[Wed Jun 20 14:17:09 2012] [warn] ModSecurity: Loaded PCRE do not match with compiled!
[Wed Jun 20 14:17:09 2012] [notice] ModSecurity: LUA compiled version="Lua 5.1"
[Wed Jun 20 14:17:09 2012] [notice] ModSecurity: LIBXML compiled version="2.6.32"
[Wed Jun 20 14:17:10 2012] [notice] Apache/2.2.22 (Unix) mod_ssl/2.2.22 OpenSSL/0.9.8o mod_apreq2-20090110/2.8.0 mod_perl/2.0.5 Perl/v5.10.1 configured -- resuming normal operations

Thanks!
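For reference, the collection-key scheme the rules above build (client IP plus the first 8 hex chars of the MD5 of the request URI) can be sketched in a few lines of Python. The function name and the empty-string convention for skipped internal traffic are illustrative, not part of the ModSecurity config:

```python
import hashlib

def collection_key(remote_addr, request_uri):
    """Mimic the SecRule chain: key = IP + ':' + first 8 hex chars of MD5(URI).

    Requests from 10.0.0.0/8 are skipped, matching the !@rx ^10\. rules;
    t:md5 followed by t:hexEncode yields lowercase hex, as hexdigest() does.
    """
    if remote_addr.startswith("10."):
        return ""  # internal traffic: no collection is initialized
    uri_hash = hashlib.md5(request_uri.encode()).hexdigest()[:8]
    return "%s:%s" % (remote_addr, uri_hash)
```

At 13359 observed keys, each key is a short string like "1.2.3.4:d1743d6b", so the on-disk size question in the post is clearly not about key length.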
From: Reindl H. <h.r...@th...> - 2012-06-20 19:20:54
On 20.06.2012 21:09, Mark Moseley wrote:
> This is a "Just checking if this is normal" post. This may also be a
> "You're doing it wrong" post :)
>
> I've got a collection on a fairly busy server, running apache 2.2.22
> with modsec 2.6.3, Debian Squeeze 32-bit. I'm trying to track IP+URL
> for ratelimiting. I'm constructing the IP collection like this

oh no - do not implement DOS-protection at the apache level. This is completely wrong and will not work against real attacks. "iptables" can do this much better, on a much lower level:

cat /etc/modprobe.d/iptables-recent.conf
options ipt_recent ip_list_tot=10000 ip_pkt_list_tot=200
_____________________________________________________________

0 0 DROP udp -- eth0 * !192.168.2.0/24 0.0.0.0/0 state NEW recent: UPDATE seconds: 2 hit_count: 70 name: udpflood side: source
0 0 DROP tcp -- eth0 * !192.168.2.0/24 0.0.0.0/0 state NEW recent: UPDATE seconds: 2 hit_count: 150 name: DEFAULT side: source
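Counters like the two shown above come from the iptables "recent" match. A hedged sketch of roughly equivalent commands (interface, networks, and thresholds are taken from the counter output but are illustrative; these require root, and a companion --set rule, not visible in the counters, normally adds addresses to each list first):

```shell
# Load the "recent" match with larger tables (matches the modprobe.d options above)
modprobe ipt_recent ip_list_tot=10000 ip_pkt_list_tot=200

# Drop new UDP connections from sources exceeding 70 hits in 2 seconds
iptables -A INPUT -i eth0 ! -s 192.168.2.0/24 -p udp -m state --state NEW \
    -m recent --update --seconds 2 --hitcount 70 --name udpflood --rsource -j DROP

# Same idea for TCP, with a higher threshold
iptables -A INPUT -i eth0 ! -s 192.168.2.0/24 -p tcp -m state --state NEW \
    -m recent --update --seconds 2 --hitcount 150 --name DEFAULT --rsource -j DROP
```

This is a config fragment, not something to paste blindly: hitcount values above ip_pkt_list_tot are silently ineffective, which is why the modprobe option raises it to 200.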
From: Ryan B. <RBa...@tr...> - 2012-06-20 19:30:26
On 6/20/12 3:09 PM, "Mark Moseley" <mos...@gm...> wrote:
> [snip - full quote of the original message]

Mark,

1) The persistent storage files are sparse data files. Run a "du -b" vs. "ls -l" and see what the difference is.

2) Have you reviewed the DoS/Brute Force rules in the OWASP CRS? These may have the functionality you want.
3) You could also consider using the SecGuardianLog directive with the httpd-guardian.pl script - http://mod-security.svn.sourceforge.net/viewvc/mod-security/crs/trunk/util/httpd-guardian.pl?revision=1961&content-type=text%2Fplain

This monitors the same data going to the apache logs and can fire off commands to iptables to blacklist source IP addresses at a lower level.

--
Ryan Barnett
Trustwave SpiderLabs
ModSecurity Project Leader
OWASP ModSecurity CRS Project Leader

This transmission may contain information that is privileged, confidential, and/or exempt from disclosure under applicable law. If you are not the intended recipient, you are hereby notified that any disclosure, copying, distribution, or use of the information contained herein (including any reliance thereon) is STRICTLY PROHIBITED. If you received this transmission in error, please immediately contact the sender and destroy the material in its entirety, whether in electronic or hard copy format.
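The sparse-file point in 1) is easy to verify from a script. A minimal Python sketch (filenames are temporary and illustrative) showing that the apparent size st_size, which "ls -l" reports, can far exceed the allocated size st_blocks * 512, which "du" is based on; on a filesystem without sparse-file support the two may instead be close:

```python
import os
import tempfile

# Create a file with a ~100 MB hole: seek far past the end, write one byte.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.seek(100 * 1024 * 1024)
    f.write(b"\0")
    path = f.name

st = os.stat(path)
apparent = st.st_size           # what "ls -l" shows
allocated = st.st_blocks * 512  # what "du" counts
print("apparent=%d allocated=%d" % (apparent, allocated))
os.unlink(path)
```

On a typical Linux filesystem the hole costs nothing on disk, which is consistent with a multi-gig ip.pag whose live records sum to only a few meg.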
From: Mark M. <mos...@gm...> - 2012-06-20 20:42:40
> Mark,
> 1) The persistent storage files are sparse data files. Run a "du -b" vs.
> "ls -l" and see what the difference is.

That'd explain that. I didn't even consider that possibility. Any performance considerations there? If I end up with a 100 gig sparse file, will it make modsec unhappy in any sort of way?

> 2) Have you reviewed the DoS/Brute Force rules in the OWASP CRS? These
> may have the functionality you want.

I'll definitely check that out.

> 3) You could also consider using the SecGuardianLog directive with the
> httpd-guardian.pl script -
> http://mod-security.svn.sourceforge.net/viewvc/mod-security/crs/trunk/util/httpd-guardian.pl?revision=1961&content-type=text%2Fplain
>
> This monitors the same data going to the apache logs and can fire off
> commands to IPTables to blacklist source IP addresses at a lower level.

I haven't, but I'll check that out too. Iptables unfortunately doesn't help me here, since connections aren't really coming from the IPs I want to block. They're actually coming from our Akamai proxies, and we use a combo of mod_perl and mod_rpaf2 to set them back. I didn't want to distract from the config I'd posted above, but it's actually testing against ENV:REAL_REMOTE_HOST, which I'm setting in mod_perl (thus the extra precaution of the @rx test).
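The mod_perl/mod_rpaf2 step described above amounts to recovering the real client IP from a forwarded header before the rate-limit key is built. A rough Python sketch of that idea (the header handling, function name, and trusted-proxy list are illustrative; this is not the poster's actual mod_perl code, and real CDN ranges would go in the trusted list):

```python
import ipaddress
from typing import Optional

# Illustrative proxy ranges we trust to set the forwarded header
TRUSTED_PROXIES = [ipaddress.ip_network("10.0.0.0/8")]

def real_client_ip(remote_addr, x_forwarded_for):
    # type: (str, Optional[str]) -> str
    """Return the client IP to rate-limit on.

    If the TCP peer is a trusted proxy (e.g. a CDN edge), take the last
    address in X-Forwarded-For (the hop appended by the proxy closest to
    us); otherwise trust the socket address itself.
    """
    peer = ipaddress.ip_address(remote_addr)
    if x_forwarded_for and any(peer in net for net in TRUSTED_PROXIES):
        return x_forwarded_for.split(",")[-1].strip()
    return remote_addr
```

Only trusting the header when the connection comes from a known proxy matters here: otherwise any client can spoof X-Forwarded-For and dodge the rate limit.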
From: Breno S. <bre...@gm...> - 2012-06-20 19:45:07
Mark,

Just two recommendations. You are running ModSecurity with library versions different from the ones it was compiled against; it could cause crashes. Also, I saw you are not using the SESSION and USER collections; still, it would be a good idea to upgrade to a newer ModSecurity version, because a problem related to those collections was fixed more recently.

You can also check SecCollectionTimeout to define a lower timeout value for your collections.

Thanks

Breno

On Wed, Jun 20, 2012 at 2:30 PM, Ryan Barnett <RBa...@tr...> wrote:
> [snip - full quote of earlier messages in the thread]
From: Mark M. <mos...@gm...> - 2012-06-20 20:45:00
On Wed, Jun 20, 2012 at 12:45 PM, Breno Silva <bre...@gm...> wrote:
> Just two recommendations. You are running modsecurity with different library
> versions from apache, it could cause crash.
> Also i saw you are not using SESSION and USER collections. However would be
> a good idea upgrade to a newer modsecurity version because a problem related
> to that collections were fixed more recently.
>
> You can also check SecCollectionTimeout to define a lower timeout value for
> your collections.

Good to know. Incidentally, is there somewhere else, besides IP, that I should be (ab)using to store the IP+URI-hash pair? IIRC I'd seen in a comment of yours on the list from a while back that arbitrarily named collections were a possibility for the future. Is that still the case?
From: Mark M. <mos...@gm...> - 2012-06-20 20:33:55
On Wed, Jun 20, 2012 at 12:20 PM, Reindl Harald <h.r...@th...> wrote:
> oh no - do not implement DOS-protection on the apache-level
> this is completly wrong and will not work at real attacks
>
> "iptables" can do this much better on a much lower level
> [snip - ipt_recent example]

I'm pretty familiar with iptables ddos solutions. For this particular exercise (and I'm basically only at PoC stage now), I need to track the combination of IP and URI. It's not uncommon for a single IP to hit us a lot (like a googlebot), but it isn't normal for the same IP to hit the same URI 1000 times. Plus we are fronted by Akamai, so the actual TCP connection is coming from them anyway ... and blocking Akamai's IPs would be bad :)
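The setvar/expirevar pair in the original rules implements exactly this per-key counting with a TTL. A minimal in-memory Python sketch of those semantics (class and names are illustrative; ModSecurity itself persists the counters in the SDBM collection, not in process memory):

```python
import time

class ExpiringCounter:
    """Per-key hit counter with a TTL, mimicking ModSecurity's
    setvar:IP.req_count_60s=+1 plus expirevar:IP.req_count_60s=60."""

    def __init__(self, ttl):
        self.ttl = ttl
        self._data = {}  # key -> (count, expiry_timestamp)

    def hit(self, key, now=None):
        """Record one hit for `key` and return the current count."""
        now = time.time() if now is None else now
        count, expiry = self._data.get(key, (0, 0.0))
        if now >= expiry:
            count = 0  # previous window expired; start a fresh count
        count += 1
        # expirevar pushes the expiry forward on every update
        self._data[key] = (count, now + self.ttl)
        return count

# Usage sketch: the key mirrors the IP collection key (IP + URI hash);
# a threshold check here corresponds to the omitted 'deny' rule.
limiter = ExpiringCounter(ttl=60)
```

One design note: like expirevar, this expires a key only relative to its last update, so a steadily hammering client never resets its own counter.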
From: Reindl H. <h.r...@th...> - 2012-06-20 20:56:27
On 20.06.2012 22:33, Mark Moseley wrote:
> [snip - full quote of the thread so far]
> It's not uncommon for a single IP to
> hit us a lot (like a googlebot) but it isn't normal for the same IP to
> hit the same URI 1000 times.
it is uncommon that googlebot would ever exceed rate-limits; they are aware of the target IP and do all they can to avoid acting like crazy toward the same IP, independent of how many vhosts it serves.

the problem with DOS-protection at the apache layer is that you establish a full connection, the connection is using a worker process, and any rate-control at the apache layer is because of this pretty useless if a real attack happens. i was a short time ago the target of a REAL attack, and in that case even iptables can not help you - but it takes a little pressure off the application layer, which would otherwise be dead long before.

the next problem is how many resources a protection itself consumes - you can be sure httpd+modsec is many times more expensive than any protection on a lower layer.

forget it - under a DOS attack, modsec as protection may make things even worse