From Images to High-Fidelity 3D Assets
Wan2.1: Open and Advanced Large-Scale Video Generative Model
Advancing Open-source World Models
Generate Any 3D Scene in Seconds
State-of-the-art TTS model under 25MB
Open-source multi-speaker long-form text-to-speech model
Stable Virtual Camera: Generative View Synthesis with Diffusion Models
Foundation model for image generation
Multimodal Diffusion with Representation Alignment
A Conversational Speech Generation Model
Code for "Image Generation from Scene Graphs", Johnson et al., CVPR 2018
Dia-1.6B generates lifelike English dialogue and vocal expressions