GPT‑SoVITS is a state-of-the-art voice conversion and text-to-speech (TTS) system that performs zero‑shot and few‑shot synthesis from a short vocal sample (e.g., 5 seconds). It supports cross‑lingual speech synthesis across English, Chinese, Japanese, Korean, Cantonese, and more. It is built on the VITS architecture, enhanced for few‑sample adaptation and real‑time use.
Features
- Zero‑shot TTS: generate speech from a 5‑second voice sample
- Few‑shot fine-tuning: 1 minute of data for improved voice likeness
- Cross-lingual support across multiple languages
- Web UI for inference and batch generation
- Open-source with pretrained model weights
- Active community and publication‑grade performance
Categories
Voice Cloning
License
MIT License