FlashMLA: Efficient Multi-head Latent Attention Kernels
MiniMax M2.1, a state-of-the-art model for real-world development and agentic workflows
A Family of Open Foundation Models for Code Intelligence
Open-weight, large-scale hybrid-attention reasoning model
Qwen2.5-Coder is the code version of Qwen2.5, the large language model series from the Qwen team at Alibaba Cloud
Lightweight 24B agentic coding model with vision and long context
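The entries above all ship open weights on Hugging Face. As a minimal sketch (not any one entry's official quickstart), the following assumes the standard transformers API; the checkpoint id Qwen/Qwen2.5-Coder-7B-Instruct is an assumption standing in for whichever model and size you actually want:

    # Minimal sketch: loading one of the models above via Hugging Face transformers.
    # The checkpoint id is an assumption (Qwen2.5-Coder ships in several sizes);
    # substitute the model you actually want.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "Qwen/Qwen2.5-Coder-7B-Instruct"  # assumed variant
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype="auto", device_map="auto"
    )

    # Build a chat prompt, generate, and decode only the newly produced tokens.
    messages = [{"role": "user", "content": "Write a Python function that reverses a string."}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=256)
    print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))

The same pattern applies to the other instruction-tuned models listed, with their respective Hugging Face checkpoint ids.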