State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX
Video translation and dubbing tool powered by LLMs
Capable of understanding text, audio, vision, video
Robust Speech Recognition via Large-Scale Weak Supervision
SOTA Open Source TTS
A library for audio and music analysis, feature extraction
Convert files and web content into clean, usable Markdown easily
Oobabooga - The definitive Web UI for local AI, with powerful features
TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning
Self-hosted AI audio transcription
Generate music based on natural language prompts using LLMs
Local-first AI Notepad for Private Meetings
Data manipulation and transformation for audio signal processing
Open speech-to-speech models and pipelines by Hugging Face
Framework for building real-time voice and multimodal AI agents
Unified web UI for training and running open models locally
Offline Text To Speech synthesis for Python
ComfyUI integration for Microsoft's VibeVoice text-to-speech model
Open Source Speech Language Model
AI tool converting video/audio into structured documents instantly
Give Claude the ability to watch and understand videos
Generate blog articles from video or audio
Manage Claude Code in style
A Systematic Framework for Interactive World Modeling
Towards Human-Sounding Speech