EPLB
Expert Parallelism Load Balancer
EPLB is DeepSeek’s open implementation of a load balancing algorithm designed for expert parallelism (EP) settings in MoE architectures. In EP, different “experts” are mapped to different GPUs or nodes, so load imbalance becomes a performance bottleneck if certain experts are invoked much more often. EPLB solves this by duplicating heavily used experts (redundancy) and then placing those duplicates across GPUs to even out computational load. It uses policies like hierarchical load balancing...