From Images to High-Fidelity 3D Assets
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
Global weather forecasting model using graph neural networks and JAX
Inference script for Oasis 500M
Research code artifacts for Code World Model (CWM)
OCR expert VLM powered by Hunyuan's native multimodal architecture
Towards Real-World Vision-Language Understanding
Official inference repo for FLUX.2 models
DeepSeek Coder: Let the Code Write Itself
A Customizable Image-to-Video Model based on HunyuanVideo
The official repo of the Qwen chat and pretrained large language models
Z80-μLM is a 2-bit quantized language model
Foundation Models for Time Series
A Pragmatic VLA Foundation Model
Ling is a MoE LLM provided and open-sourced by InclusionAI
PyTorch code and models for the DINOv2 self-supervised learning
GLM-4-Voice | End-to-End Chinese-English Conversational Model
Stable Diffusion with Core ML on Apple Silicon
NVIDIA Isaac GR00T N1.5 is the world's first open foundation model
Diversity-driven optimization and large-model reasoning ability
Unified Multimodal Understanding and Generation Models
A SOTA open-source image editing model
Repo for the Qwen2-Audio chat and pretrained large audio language models
High-Fidelity and Controllable Generation of Textured 3D Assets
Open-weight, large-scale hybrid-attention reasoning model