Open-Source Speech Language Model
Open-source multi-speaker long-form text-to-speech model
Audio foundation model excelling in audio understanding
Multi-modal large language model designed for audio understanding
A Conversational Speech Generation Model
Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020)
CTC-based audio-text forced aligner supporting 158 languages