4M: Massively Multimodal Masked Modeling
Open-source large language model family from Tencent Hunyuan
Designed for text embedding and ranking tasks
Ling-V2 is a MoE LLM provided and open-sourced by InclusionAI
A Pragmatic VLA Foundation Model
Collection of Gemma 3 variants that are trained for performance
Ling is a MoE LLM provided and open-sourced by InclusionAI
A state-of-the-art open visual language model
Diversity-driven optimization and large-model reasoning ability
Open-source repository for the Pokee Deep Research model
Long-form streaming TTS system for multi-speaker dialogue generation
OpenTinker is an RL-as-a-Service infrastructure for foundation models
Block Diffusion for Ultra-Fast Speculative Decoding
Tongyi Deep Research, the Leading Open-source Deep Research Agent
Stable Virtual Camera: Generative View Synthesis with Diffusion Models
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Multi-modal large language model designed for audio understanding
Large Multimodal Models for Video Understanding and Editing
Official code base for LeWorldModel: Stable End-to-End Joint-Embedding
The official PyTorch implementation of Google's Gemma models
Generate Any 3D Scene in Seconds
LLM-based reinforcement learning model for audio editing
New family of code large language models (LLMs)
Multimodal embedding and reranking models built on Qwen3-VL
Unified Multimodal Understanding and Generation Models