Powerful Mixture-of-Experts (MoE) AI language model optimized for efficiency and performance
NVIDIA Isaac GR00T N1.5 is the world's first open foundation model
Real-time behaviour synthesis with MuJoCo, using Predictive Control
Chat & pretrained large audio language model proposed by Alibaba Cloud
Models for object and human mesh reconstruction
Diffusion Transformer with Fine-Grained Chinese Understanding
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
A state-of-the-art open visual language model
RGBD video generation model conditioned on camera input
Qwen-Image is a powerful image generation foundation model
A Customizable Image-to-Video Model based on HunyuanVideo
Wan2.2: Open and Advanced Large-Scale Video Generative Model
Pokee Deep Research Model Open Source Repo
Multimodal-Driven Architecture for Customized Video Generation
Reference PyTorch implementation and models for DINOv3
High-Resolution 3D Assets Generation with Large Scale Diffusion Models
Example Discord bot written in Python that uses the completions API
The official repo of Qwen chat & pretrained large language model
FAIR Sequence Modeling Toolkit 2
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
CodeGeeX2: A More Powerful Multilingual Code Generation Model
Chat & pretrained large vision language model
Qwen2.5-VL is a multimodal large language model series
Pushing the Limits of Mathematical Reasoning in Open Language Models
Official implementation of Watermark Anything with Localized Messages