Official codebase for LeWorldModel: Stable End-to-End Joint-Embedding
Repo for SeedVR2 & SeedVR
Diversity-driven optimization and large-model reasoning ability
Chinese and English multimodal conversational language model
Repo of Qwen2-Audio chat & pretrained large audio language model
Project Lyra: Open Generative 3D World Models
Open-weight, large-scale hybrid-attention reasoning model
Genome modeling and design across all domains of life
Inference script for Oasis 500M
Official implementation of DreamCraft3D
NVIDIA Isaac GR00T N1.5 is the world's first open foundation model
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
A Systematic Framework for Interactive World Modeling
Open-source framework for intelligent speech interaction
A 0.1B Omni model trained from scratch
26M function-calling model that runs on incredibly small devices
Qwen3-ASR is an open-source series of ASR models
A Pragmatic VLA Foundation Model
OpenTinker is an RL-as-a-Service infrastructure for foundation models
Block Diffusion for Ultra-Fast Speculative Decoding
Multimodal embedding and reranking models built on Qwen3-VL
Collection of Gemma 3 variants that are trained for performance
VMZ: Model Zoo for Video Modeling
Video understanding codebase from FAIR for reproducing video models
CLIP: predict the most relevant text snippet given an image