Collection of Gemma 3 variants that are trained for performance
Code and models for the ICML 2024 paper NExT-GPT
Style-Bert-VITS2: Bert-VITS2 with more controllable voice styles
Easy-to-use and powerful NLP library with an awesome model zoo
Open source machine learning framework to automate text conversations
Synchronized Translation for Videos
Official Python inference and LoRA trainer package
Clone a voice in 5 seconds to generate arbitrary speech in real-time
Open Source Document Management System for Digital Archives
World's first open-source, agentic video production system
Unified web UI for training and running open models locally
Large language model and vision-language model based on linear attention
Knowledge Graph Generation from Any Text
Open-source multi-speaker long-form text-to-speech model
A Multi-Modal World Model for Reconstructing, Generating, and Simulating
The most powerful local music generation model
A sound cloning tool with a web interface, using your voice
A web UI for easy subtitle generation using the Whisper model
End-to-end speech processing toolkit
A high-quality PDF-to-Markdown tool based on a large language model
Free, high-quality text-to-speech API endpoint to replace OpenAI
Enhances Tesseract OCR output using LLMs (local or API)
Autoregressive Model Beats Diffusion
tiktoken is a fast BPE tokeniser for use with OpenAI's models
General-purpose image editing model that delivers high-fidelity