Multilingual speech recognition and audio understanding model
A nearly-live implementation of OpenAI's Whisper
Virtual Python environment builder
Swing Music is a beautiful, self-hosted music player
Clone a voice in 5 seconds to generate arbitrary speech in real-time
Comprehensive Gradio WebUI for audio processing
Multimodal Diffusion with Representation Alignment
Open speech-to-speech models and pipelines by Hugging Face
Instant voice cloning by MIT and MyShell. Audio foundation model
Share interesting, entry-level open source projects on GitHub
Capable of understanding text, audio, vision, video
AI video generator optimized for low VRAM and older GPUs
Free, high-quality text-to-speech API endpoint to replace OpenAI
Implementation of the AudioLM audio generation model in PyTorch
AudioMuse-AI is an open-source Dockerized environment
Automatically translates the text of a video based on a subtitle file
Speakr is a personal, self-hosted web application
Streaming Real-time Audio-Driven Avatar Generation
A free, online learning platform to make quality education accessible
Oobabooga - The definitive Web UI for local AI, with powerful features
Fast multimodal LLM for real-time voice interaction and AI apps
Offline text-to-speech synthesis for Python
The music player of today
Convert Python notebook to web app and share with non-technical users
SOTA Open Source TTS