TTS with Kokoro and ONNX Runtime
MARS5 speech model (TTS) from CAMB.AI
Chat with it via text and voice
TTS model capable of streaming conversational audio in realtime
Use Microsoft Edge's online text-to-speech service from Python
Robust Speech Recognition via Large-Scale Weak Supervision
Open-source framework for intelligent speech interaction
Management of Yandex Station and other smart home devices
The official Python library for the Fish Audio API
Offline inference engine for art, real-time voice conversations
Repo of Qwen2-Audio chat & pretrained large audio language model
SOTA Open Source TTS
Style-Bert-VITS2: Bert-VITS2 with more controllable voice styles
SoTA open-source TTS
Flowly is 100x faster than OpenClaw
Open speech-to-speech models and pipelines by Hugging Face
Context-aware desktop AI assistant that understands screen content
Voice Recognition to Text Tool
Framework for building AI-powered interactive digital humans and agents
An Open Source text-to-speech system built by inverting Whisper
Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model
A nearly-live implementation of OpenAI's Whisper
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
World's first open-source, agentic video production system
A Web UI for easy subtitle generation using the Whisper model