ComfyUI integration for Microsoft's VibeVoice text-to-speech model
Cosmos-RL is a flexible and scalable Reinforcement Learning framework
Personalize Any Characters with a Scalable Diffusion Transformer
Image inpainting tool powered by SOTA AI models
All-in-one WebUI for AI generative image and video creation
Official Python inference and LoRA trainer package
Code and models for the ICML 2024 paper NExT-GPT
Inference script for Oasis 500M
Official PyTorch Implementation
State-of-the-art (SoTA) text-to-video pre-trained model
A Unified Framework for Image Customization
A SOTA open-source image editing model
Modular AI image and video generation web UI with extensible tools
High-Fidelity and Controllable Generation of Textured 3D Assets
AI Image Upscaler & Enhancer
Marrying Grounding DINO with Segment Anything & Stable Diffusion
A PyTorch library for implementing flow matching algorithms
Official inference repo for FLUX.1 models
Virtual AI anchor that combines state-of-the-art technology
Continuation of NetherSX2 based on AetherSX2 3668
A Powerful Native Multimodal Model for Image Generation
Text- and image-to-video generation: CogVideoX (2024) and CogVideo
State-of-the-art Parameter-Efficient Fine-Tuning
A fast TTS architecture with conditional flow matching
Generating Immersive, Explorable, and Interactive 3D Worlds