Matthew Patton - 2013-07-13

I saw something very much like this being deployed worldwide in 2010 by the world's largest CDN. About the only difference I can remember is that a given host accepted traffic based on a bit-mask of the source IP and also had a vIP membership mask, so it never needed to do connection tracking. It was simple and brutally fast. I don't remember whether there was much concern over hot spots in the final octet of the source address, but I'm sure that could be tweaked dynamically by periodic sampling.

Say you had 4 hosts in a cluster and 1 vIP: you would partition the 256 possible values of the final source octet into, say, groups of 16 contiguous addresses and then farm those out across the 4 hosts in some fashion. If a block was too hot, you could chop it in half and move that piece of the load to another host.
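
To make the partitioning concrete, here is a rough Python sketch. It is only my reading of the idea, not the CDN's actual code: it assumes the decision is keyed on the last octet of an IPv4 source address, that each block is a (value, mask) pair, and that blocks are dealt round-robin. The host names and the functions `build_table`, `owner` and `split_block` are made up for illustration.

```python
import ipaddress

def build_table(hosts, bits=4):
    """Deal 2**bits contiguous blocks of the last octet round-robin across hosts.

    Each block is a (value, mask) pair; bits=4 gives 16 blocks of 16 addresses.
    """
    shift = 8 - bits
    mask = (0xFF << shift) & 0xFF
    return {(i << shift, mask): hosts[i % len(hosts)] for i in range(1 << bits)}

def owner(table, src_ip):
    """Return the host whose mask matches src_ip. Stateless: no connection tracking."""
    last = int(ipaddress.IPv4Address(src_ip)) & 0xFF
    for (value, mask), host in table.items():
        if last & mask == value:
            return host
    raise LookupError(f"no block covers {src_ip}")

def split_block(table, value, mask, new_host):
    """Chop a hot block in half and hand the upper half to new_host."""
    low = ~mask & 0xFF              # bits not yet covered by the mask
    if low == 0:
        raise ValueError("block is a single address and cannot be split")
    new_bit = (low + 1) >> 1        # highest uncovered bit
    old_host = table.pop((value, mask))
    table[(value, mask | new_bit)] = old_host
    table[(value | new_bit, mask | new_bit)] = new_host

# Hypothetical cluster: 4 hosts, blocks dealt a, b, c, d, a, b, ...
hosts = ["host-a", "host-b", "host-c", "host-d"]
table = build_table(hosts)

before = owner(table, "203.0.113.45")      # last octet 45 falls in block 32..47
split_block(table, 0x20, 0xF0, "host-d")   # block 32..47 is hot: move 40..47
after = owner(table, "203.0.113.45")
```

Because every host can evaluate the same masks independently, no host needs to know about any individual connection; only the table has to be kept consistent across the cluster.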

Health and load checks would periodically add or subtract masks, as well as add or delete vIP participation.
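
A sketch of how a health check might shuffle masks and vIP membership, again in Python and again just an assumption on my part: the `Host` class, `apply_health`, the "give each orphaned mask to the survivor holding the fewest masks" policy, and the addresses are all invented for illustration.

```python
import ipaddress
from dataclasses import dataclass, field

@dataclass
class Host:
    name: str
    vips: set = field(default_factory=set)    # vIPs this host participates in
    rules: set = field(default_factory=set)   # (value, mask) pairs on the last source octet

    def accepts(self, src_ip, vip):
        """Accept only if we serve this vIP and one of our masks matches the source."""
        if vip not in self.vips:
            return False
        last = int(ipaddress.IPv4Address(src_ip)) & 0xFF
        return any(last & mask == value for value, mask in self.rules)

def apply_health(hosts, healthy):
    """Strip masks and vIPs from unhealthy hosts and hand the masks to survivors."""
    survivors = [h for h in hosts if healthy[h.name]]
    if not survivors:
        raise RuntimeError("no healthy hosts left")
    for h in hosts:
        if healthy[h.name]:
            continue
        for rule in sorted(h.rules):
            # Assumed policy: the survivor with the fewest masks takes the next one.
            min(survivors, key=lambda s: len(s.rules)).rules.add(rule)
        h.rules.clear()
        h.vips.clear()

# Hypothetical cluster: 1 vIP, 4 hosts, 16 blocks of 16 dealt round-robin.
VIP = "198.51.100.10"
hosts = [Host(f"host-{c}", {VIP}) for c in "abcd"]
for i in range(16):
    hosts[i % 4].rules.add((i << 4, 0xF0))

# host-b fails its health check; its four blocks move to the other three hosts.
apply_health(hosts, {"host-a": True, "host-b": False, "host-c": True, "host-d": True})
```

A load check could reuse the same mechanism, moving a single mask off a busy host instead of draining it completely.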

Next, make sure this plays nicely with the incoming-traffic load-spreading features that Google contributed to the Linux kernel in 2.6.35: Receive Packet Steering (RPS) and Receive Flow Steering (RFS).