Limiting bandwidth based on concurrent connections

Steve
2013-10-18
2013-11-03
  • Steve
    2013-10-18

    I am currently doing this to keep bot traffic from killing a server:

    # set the variable Spider for requests from known search bots
    BrowserMatch (slurp|googlebot|bingbot) Spider
    # limit spiders to 5 concurrent requests and 50 kbytes/s of bandwidth
    QS_EventRequestLimit Spider 5
    QS_EventKBytesPerSecLimit Spider 50
    

    The above places a hard limit of 5 concurrent requests from search bots, and it means spiders are ALWAYS throttled to 50 kbytes/s, which is not desirable. I would prefer to do something like this:

    BrowserMatch (slurp|googlebot|bingbot) Spider
    QS_SetEnvIf Spider [Variable that gets set if concurrent Spider connections more than 5] Slowdown
    QS_EventKBytesPerSecLimit Slowdown 50
    

    This would not place a hard limit on the number of bots connecting at once, but it would slow them down considerably whenever more than 5 are connected at the same time.

    I just can't work out how to expose the current count of connections for which the variable "Spider" has been set. Is this even possible? I have read this bit:

    "*_Clear
    The counter of the variable processed by the QS_ClientEventLimitCount directive are reset if you set the same variable suffixed by _Clear, e.g. QS_Limit_Clear."

    but I can't figure out whether this is relevant or what the syntax for using it would be. Thanks for any pointers, and great work!

     
    Last edit: Steve 2013-10-18
  • The newest version of mod_qos features the variable QS_IPConn.

    Example:

     QS_SrvMaxConnPerIP 100
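     BrowserMatch (slurp|googlebot|bingbot) Spider
     # anchored so that it matches 5 or more (an unanchored [5-9][0-9]* would miss 10, 20, ...)
     SetEnvIf QS_IPConn ^([5-9]|[1-9][0-9]+)$ FiveOrMoreConnections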
     QS_SetEnvIf FiveOrMoreConnections Spider LimitSpider=yes
     QS_EventKBytesPerSecLimit LimitSpider 50
    

    Note that QS_IPConn is calculated when a new connection is opened. All requests on a connection are therefore evaluated against the number of connections that were established at the moment that connection started. As a result, requests on the first 4 connections won't be limited by this configuration example.
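
The behaviour described above can be illustrated with a small Python model. This is only a sketch, not mod_qos code: the function name `limited_connections` is hypothetical, and it assumes that connections are opened one after another from a single IP, that the count a connection sees includes the connection itself, and that the pattern is applied as an unanchored regex search. It uses an anchored pattern for "5 or more", since an unanchored pattern such as [5-9][0-9]* does not match values like 10 or 20.

```python
import re

# Anchored pattern for "5 or more": a single digit 5-9, or any number with
# two or more digits. Without the anchors, a search for [5-9][0-9]* only
# succeeds if a digit 5-9 appears somewhere, so 10 or 20 would be missed.
FIVE_OR_MORE = re.compile(r"^([5-9]|[1-9][0-9]+)$")


def limited_connections(total):
    """Toy model: open `total` connections one after another from one IP.

    Each connection keeps the count that was current when it was opened
    (a hypothetical stand-in for the QS_IPConn snapshot) and is flagged
    as limited if that snapshot matches FIVE_OR_MORE.
    """
    flags = []
    for snapshot in range(1, total + 1):
        # The snapshot never changes afterwards, no matter how many
        # further connections are opened later.
        flags.append(bool(FIVE_OR_MORE.search(str(snapshot))))
    return flags


if __name__ == "__main__":
    for number, limited in enumerate(limited_connections(12), start=1):
        print(number, "limited" if limited else "not limited")
```

In this model, with 12 connections the first 4 are never limited, while connections 5 through 12 are, which matches the caveat about connection-time evaluation.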