Stable Diffusion WebUI Forge is a platform on top of Stable Diffusion
High-Resolution Image Synthesis with Latent Diffusion Models
User-friendly AI Interface
A single Gradio + React WebUI with extensions for ACE-Step
Image generation model with single-stream diffusion transformer
Towards Human-Level Text-to-Speech through Style Diffusion
RGBD video generation model conditioned on camera input
GLM-Image: Auto-regressive for Dense-knowledge and High-fidelity Image
TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning
Diffusion Transformer with Fine-Grained Chinese Understanding
A simple, secure MCP-to-OpenAPI proxy server
Open-source multi-speaker long-form text-to-speech model
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
Multimodal Diffusion with Representation Alignment
Inference script for Oasis 500M
Next Generation AI One-Stop Internationalization Solution
A Unified Framework for Image Customization
A PyTorch library for implementing flow matching algorithms
Official inference repo for FLUX.1 models
A Powerful Native Multimodal Model for Image Generation
Virtual AI anchor that combines state-of-the-art technology
Speech-AI-Forge is a project developed around TTS generation models
A fast TTS architecture with conditional flow matching
An Open Source text-to-speech system built by inverting Whisper
Advanced language and coding AI model