Stable Diffusion WebUI Forge is a platform built on top of Stable Diffusion
High-Resolution Image Synthesis with Latent Diffusion Models
Image generation model with single-stream diffusion transformer
RGBD video generation model conditioned on camera input
Diffusion Transformer with Fine-Grained Chinese Understanding
Towards Human-Level Text-to-Speech through Style Diffusion
TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning
Open-source multi-speaker long-form text-to-speech model
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
Multimodal Diffusion with Representation Alignment
A Unified Framework for Image Customization
Official code for Style Aligned Image Generation via Shared Attention
Next Generation AI One-Stop Internationalization Solution
A PyTorch library for implementing flow matching algorithms
NVIDIA Isaac GR00T N1.5 is the world's first open foundation model
A fast TTS architecture with conditional flow matching
A Powerful Native Multimodal Model for Image Generation
Virtual AI anchor that combines state-of-the-art technology
A Universal Customization Method for Single and Multi Conditioning
Flexible Photo Recrafting While Preserving Your Identity
An open-source text-to-speech system built by inverting Whisper
C++ inference library for multiple SVC/TTS
Di♪♪Rhythm: Blazingly Fast & Simple End-to-End Song Generation
Generate 3D objects conditioned on text or images
Plug-n-play module turning text-to-image models into animation