TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning
Repo for SeedVR2 & SeedVR
Open-source multi-speaker long-form text-to-speech model
Official SeedVR2 Video Upscaler for ComfyUI
Inference script for Oasis 500M
InvokeAI is a leading creative engine for Stable Diffusion models
Official PyTorch Implementation
Multimodal Diffusion with Representation Alignment
Personalize Any Characters with a Scalable Diffusion Transformer
State-of-the-art (SoTA) text-to-video pre-trained model
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
Deep learning framework
ComfyUI integration for Microsoft's VibeVoice text-to-speech model
Image inpainting tool powered by SOTA AI models
Stable-diffusion-webui-pixelization
A Unified Framework for Image Customization
A SOTA open-source image editing model
High-Fidelity and Controllable Generation of Textured 3D Assets
Text-to-video and image-to-video generation: CogVideoX (2024) and CogVideo
A PyTorch library for implementing flow matching algorithms
A Powerful Native Multimodal Model for Image Generation
Generating Immersive, Explorable, and Interactive 3D Worlds
State-of-the-art Parameter-Efficient Fine-Tuning
Enables the best performance on NVIDIA RTX Graphics Cards
An open-source text-to-speech system built by inverting Whisper