...The model employs gated neural networks with recurrent processing blocks that disentangle voices over multiple computational steps, while maintaining speaker consistency across output channels. Separate models are trained for different speaker counts, and the largest-capacity model dynamically determines the actual number of speakers in a mixture. The repository includes all necessary scripts for training, dataset preparation, distributed training, evaluation, and audio separation.