Towards Human-Level Text-to-Speech through Style Diffusion
Stable Virtual Camera: Generative View Synthesis with Diffusion Models
Stable Diffusion built into Blender
Wan2.1: Open and Advanced Large-Scale Video Generative Model
TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning
Open-source multi-speaker long-form text-to-speech model
Tencent Hunyuan multimodal diffusion transformer (MM-DiT) model
Personalize Any Characters with a Scalable Diffusion Transformer
Flutter-based cross-platform app integrating major AI models
InvokeAI is a leading creative engine for Stable Diffusion models
Multimodal Diffusion with Representation Alignment
A Unified Framework for Image Customization
State-of-the-art (SoTA) text-to-video pre-trained model
Official PyTorch Implementation
A SOTA open-source image editing model
ComfyUI integration for Microsoft's VibeVoice text-to-speech model
A Rust machine learning framework
Image inpainting tool powered by a SOTA AI model
Official code for Style Aligned Image Generation via Shared Attention
High-Fidelity and Controllable Generation of Textured 3D Assets
A fork of Illyasviel's Forge GitHub repository
Text- and image-to-video generation: CogVideoX (2024) and CogVideo
Global weather forecasting model using graph neural networks and JAX
Generating Immersive, Explorable, and Interactive 3D Worlds
UI application to connect multiple AI models together