Robust Speech Recognition via Large-Scale Weak Supervision
A Web UI for easy subtitle generation using the Whisper model
Multilingual Automatic Speech Recognition with word-level timestamps
An opinionated CLI to transcribe audio files with Whisper on-device
A nearly-live implementation of OpenAI's Whisper
Comprehensive Gradio WebUI for audio processing
An Open Source text-to-speech system built by inverting Whisper
MCP server enabling AI agents to control and automate Windows OS
Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD
A Family of Open-Source Music Foundation Models
A Telegram bot that integrates with OpenAI's official ChatGPT APIs
AI-powered tool for generating, optimizing, and translating subtitles
Unlimited, private and free Speech-To-Text program
Speech-AI-Forge is a project developed around a TTS generation model
A Pythonic framework to simplify AI service building
Voice Recognition to Text Tool
Chat with it via text and voice
A Python tool that uses GPT-4, FFmpeg, and OpenCV
Generate blog articles from video or audio
Refactoring ChatBot+LLM, GPT-3.5-turbo, ChatGPT Bot/Voice Assistant
Real-time face swap and one-click video deepfake
GUI for a Vocal Remover that uses Deep Neural Networks
State-of-the-art 2D and 3D Face Analysis Project
Video-based AI memory library. Store millions of text chunks in MP4
Stable Diffusion web UI