A nearly-live implementation of OpenAI's Whisper
Chinese Llama-3 LLMs developed from Meta Llama 3
A Customizable Image-to-Video Model based on HunyuanVideo
Towards Human-Sounding Speech
Lightning-fast, on-device TTS, running natively via ONNX
Interface for OuteTTS models
A computer vision framework to create and deploy apps in minutes
UME is an in-app debug kit platform for Flutter
Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020)
Generative Adversarial Networks for Efficient and High Fidelity Speech
DarkForest, the Facebook Go engine
Fast, GPU-accelerated feature extraction software for speech analysis
Software Architecture for Cognitive Robotics
Dia-1.6B generates lifelike English dialogue and vocal expressions
Tiny pre-trained IBM model for multivariate time series forecasting