Robust Speech Recognition via Large-Scale Weak Supervision
Industrial-level controllable zero-shot text-to-speech system
Open-source industrial-grade ASR models
End-to-end speech processing toolkit
A Conversational Speech Generation Model
Data manipulation and transformation for audio signal processing
Singing Voice Synthesis via Shallow Diffusion Mechanism
Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020)
Toolkit for efficient experimentation with speech recognition
Beamforming and Speech Recognition Toolkit