I am currently doing this to keep bot traffic from killing a server:
# set the conditional variable to Spider
BrowserMatch (slurp|googlebot|bingbot) Spider
# limits the number of concurrent spider requests to 5 and bandwidth to 50kbytes/s
QS_EventRequestLimit Spider 5
QS_EventKBytesPerSecLimit Spider 50
The above places a hard limit of 5 concurrent requests from search bots, and it means spiders are ALWAYS throttled, which is not desirable. I would prefer to do something like this:
BrowserMatch (slurp|googlebot|bingbot) Spider
QS_SetEnvIf Spider [Variable that gets set if concurrent Spider connections more than 5] Slowdown
QS_EventKBytesPerSecLimit Slowdown 50
This would not place a hard limit on the number of bots connecting at once, but it would slow them down considerably if there are more than 5 connecting at once.
I just can't work out how to expose the current count of connections for which the variable "Spider" has been set. Any clues if this is even possible? I have read this bit:
"*_Clear
The counter of the variable processed by the QS_ClientEventLimitCount directive are reset if you set the same variable suffixed by _Clear, e.g. QS_Limit_Clear."
but I can't figure out if this is relevant or how to use it syntactically. Thanks for any pointers - Great work!
Last edit: Steve 2013-10-18
The newest version of mod_qos features the variable QS_IPConn.
Example:
Note that QS_IPConn is calculated when a new connection is opened. All requests on a connection are therefore evaluated against the number of connections that were established at the moment that connection started. As a result, requests on the first 4 connections won't be limited by this configuration example.
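To make the timing point concrete, here is a rough Python model of the behaviour described above. It is not mod_qos code and does not use any mod_qos API. The names `Connection`, `count_at_open` and `THRESHOLD` are made up for illustration, and it assumes that the count includes the connection being opened and that the slowdown applies at 5 or more.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical threshold: slow down once 5 or more connections are open.
THRESHOLD = 5


@dataclass
class Connection:
    conn_id: int
    count_at_open: int  # snapshot taken once, when the connection is opened

    def is_throttled(self) -> bool:
        # Every request on this connection reuses the snapshot from open time,
        # no matter how many connections are open when the request arrives.
        return self.count_at_open >= THRESHOLD


def open_connections(n: int) -> List[Connection]:
    """Open n connections one after another from one client, none closing."""
    conns = []
    for i in range(1, n + 1):
        # Assumption: the new connection counts itself, so the i-th sees i.
        conns.append(Connection(conn_id=i, count_at_open=i))
    return conns


if __name__ == "__main__":
    for c in open_connections(6):
        print(c.conn_id, c.count_at_open, c.is_throttled())
```

In this model the first 4 connections keep running at full speed even after a 5th and 6th have been opened, because their snapshot never changes. Only connections opened once the threshold is reached get slowed down.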