Uncommon Objects in 3D dataset
Capable of understanding text, audio, vision, and video
Release for Improved Denoising Diffusion Probabilistic Models
Reproduction of Poetiq's record-breaking submission to the ARC-AGI-1
Designed for text embedding and ranking tasks
Open-source large language model family from Tencent Hunyuan
CodeGeeX2: A More Powerful Multilingual Code Generation Model
INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
Renderer for the harmony response format to be used with gpt-oss
A Powerful Native Multimodal Model for Image Generation
Qwen3-Omni is a natively end-to-end, omni-modal LLM
AlphaFold 3 inference pipeline
CogView4, CogView3-Plus and CogView3 (ECCV 2024)
Ring is a reasoning MoE LLM provided and open-sourced by InclusionAI
Chat & pretrained large audio language model proposed by Alibaba Cloud
The Clay Foundation Model - An open source AI model and interface
Sharp Monocular Metric Depth in Less Than a Second
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning
A series of math-specific large language models of our Qwen2 series
Implementation of the Surya Foundation Model for Heliophysics
Inference framework for 1-bit LLMs
Let's make video diffusion practical