NetEase Youdao's open-source embedding and reranker models
Renderer for the harmony response format to be used with gpt-oss
OpenTinker is an RL-as-a-Service infrastructure for foundation models
Multimodal embedding and reranking models built on Qwen3-VL
General-purpose image editing model that delivers high-fidelity
Ling-V2 is a MoE LLM provided and open-sourced by InclusionAI
4M: Massively Multimodal Masked Modeling
A Customizable Image-to-Video Model based on HunyuanVideo
Qwen3-Omni is a natively end-to-end, omni-modal LLM
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Open-weight, large-scale hybrid-attention reasoning model
Towards Real-World Vision-Language Understanding
Release for Improved Denoising Diffusion Probabilistic Models
LLaMA: Open and Efficient Foundation Language Models
Per-Pixel Classification is Not All You Need for Semantic Segmentation