A TTS model capable of generating ultra-realistic dialogue
Label Studio is a multi-type data labeling and annotation tool
A general fine-tuning kit geared toward image/video/audio diffusion
EPUB to audiobook converter, optimized for Audiobookshelf
Scalable data preprocessing and curation toolkit for LLMs
Open source codebase for Scale Agentex
Qwen3-TTS is an open-source series of TTS models
Controllable & emotion-expressive zero-shot TTS
The official Python library for the OpenAI API
A high-quality rapid TTS voice cloning model
Streamlines and simplifies prompt design for both developers
Code and models for ICML 2024 paper, NExT-GPT
Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model
Converts text to speech in real time
Python inference and LoRA trainer package for the LTX-2 audio–video
Intelligent automation and multi-agent orchestration for Claude Code
AI tool converting video/audio into structured documents instantly
Qwen3-ASR is an open-source series of ASR models
A PyTorch-based Speech Toolkit
High-Quality Voice Cloning TTS for 600+ Languages
Python library and CLI tool to interface with Google Translate
Industrial-level controllable zero-shot text-to-speech system
A lightweight text-to-speech model with zero-shot voice cloning
Spring AI Alibaba examples for building and testing AI apps
AI-powered tool for generating, optimizing, and translating subtitles