A TTS model capable of generating ultra-realistic dialogue
A Python tool that uses GPT-4, FFmpeg, and OpenCV
A Systematic Framework for Interactive World Modeling
EPUB to audiobook converter, optimized for Audiobookshelf
An opinionated CLI to transcribe audio files with Whisper on-device
Open source platform for the machine learning lifecycle
High-Quality Voice Cloning TTS for 600+ Languages
Controllable & emotion-expressive zero-shot TTS
An event-driven framework designed to build multi-agent AI systems
LLM-based data scientist, AI-native data application
Open Source Speech Language Model
A general fine-tuning kit geared toward image/video/audio diffusion
Qwen3-TTS is an open-source series of TTS models
Industrial-level controllable zero-shot text-to-speech system
GPU environment management and cluster orchestration
A high-quality rapid TTS voice cloning model
Collect, organize, use, and share, all in OmniBox
Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model
Framework for orchestrating role-playing, autonomous AI agents
State-of-the-art diffusion models for image and audio generation
AI tool converting video/audio into structured documents instantly
Code and models for the ICML 2024 paper NExT-GPT
ImageBind: One Embedding Space to Bind Them All
The official Python library for the OpenAI API
An easy-to-use & supercharged open-source experiment tracker