Suite of reference architectures for building GPU-accelerated vision
High-Resolution 3D Assets Generation with Large Scale Diffusion Models
Faster Whisper transcription with CTranslate2
Generate audiobooks from e-books, voice cloning & 1107+ languages
Repo of Qwen2-Audio chat & pretrained large audio language model
Use Microsoft Edge's online text-to-speech service from Python
High-Quality Voice Cloning TTS for 600+ Languages
Aider is AI pair programming in your terminal
Synchronized Translation for Videos
Deepfakes Software For All
Universal LLM Deployment Engine with ML Compilation
Kimi Code CLI is your next CLI agent
Tokenizer-Free TTS for Multilingual Speech Generation
LTX-Video Support for ComfyUI
Edit videos with Claude Code
A high-throughput and memory-efficient inference and serving engine
Lightweight package to simplify LLM API calls
Qwen3-TTS is an open-source series of TTS models
Official repository for LTX-Video
Open-source AI agent framework
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Ready-to-use OCR with 80+ supported languages
AI agent harness for AI coding agents
Industrial-level controllable zero-shot text-to-speech system
Code for running inference and finetuning with SAM 3 model