Controllable & emotion-expressive zero-shot TTS
SOTA Open Source TTS
Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model
A fast TTS architecture with conditional flow matching
Easy-to-use Speech Toolkit including Self-Supervised Learning model
Scans a given textual string in 146 possible pen-on-paper combinations
Multi-Voice and Prompt-Controlled TTS Engine
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech)
Conditional Variational Autoencoder with Adversarial Learning
Implementation of a Transformer-based neural network
Platform of neural models for natural language processing
Converts any Vietnamese word from graphemes to phonemes
ProseVis is a visualization tool for analyzing the sound of text.
CTC-based forced aligner for audio-text in 158 languages