GLM-4-Voice | End-to-End Chinese-English Conversational Model
Open-source multi-speaker long-form text-to-speech model
ComfyUI integration for Microsoft's VibeVoice text-to-speech model
Capable of understanding text, audio, vision, and video
Two Integrated Text-to-Speech Engines Using MMS & Silero