A theoretical reconstruction of the Claude Mythos architecture
Fast stable diffusion on CPU and AI PC
Use Microsoft Edge's online text-to-speech service from Python
SoTA open-source TTS
1 minute of voice data can also be used to train a good TTS model
Open-Sora: Democratizing Efficient Video Production for All
Comprehensive Gradio WebUI for audio processing
Official inference repo for FLUX.2 models
Open-source AI agent framework
Generate short videos with one click using an LLM
High-Quality Voice Cloning TTS for 600+ Languages
Advanced language and coding AI model
Everything you need to build state-of-the-art foundation models
A lightweight audio-to-MIDI converter with pitch bend detection
Unofficial Python API and agentic skill for Google NotebookLM
A set of ready-to-use Agent Skills for research, science, engineering
Automatic Speech Recognition with Word-level Timestamps
Machine learning in Python
Python inference and LoRA trainer package for the LTX-2 audio–video
Instant voice cloning by MIT and MyShell. Audio foundation model
Chat with your documents using local AI
Open-source multi-speaker long-form text-to-speech model
TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning
Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD
Qwen3-TTS is an open-source series of TTS models