A PyTorch library for implementing flow matching algorithms
Recovering the Visual Space from Any Views
Stable Virtual Camera: Generative View Synthesis with Diffusion Models
Models for object and human mesh reconstruction
Qwen3-VL, the multimodal large language model series by Alibaba Cloud
Text-to-3D, Image-to-3D, and Mesh Export with NeRF + Diffusion
Multimodal Transformer for document image understanding and layout