IMS Toucan
Controllable and fast Text-to-Speech for over 7000 languages
...It includes complete pipelines for preprocessing datasets, training models, and running inference, plus a storage configuration system to manage where models and caches are stored. IMS-Toucan ships with several ready-to-run scripts, including GUIs for interactive demos, prosody override tools, zero-shot language embedding injection, and text-to-audio file generation. Pretrained models are automatically downloaded when needed, and there is an online demo instance hosted on GPU that anyone can try.