Stable Virtual Camera: Generative View Synthesis with Diffusion Models
Audio foundation model excelling in audio understanding
Generating Immersive, Explorable, and Interactive 3D Worlds
Qwen-Image is a powerful image generation foundation model
Tooling for the Common Objects In 3D dataset
ICLR 2024 Spotlight: curation/training code, metadata, distribution
New family of code large language models (LLMs)
Capable of understanding text, audio, vision, and video
Implementation of model parallel autoregressive transformers on GPUs