GLM-4.5-Air is a multilingual large language model with 106 billion total parameters and 12 billion active parameters, designed for conversational AI and intelligent agents. It is part of the GLM-4.5 family developed by Zhipu AI, offering hybrid reasoning capabilities via two modes: a thinking mode for complex reasoning and tool use, and a non-thinking mode for immediate responses. The model is optimized for efficiency and deployment, delivering strong results across 12 industry benchmarks, with a composite score of 59.8. GLM-4.5-Air supports both English and Chinese, and is suitable for tasks involving text generation, coding, reasoning, and tool calling. Open-sourced under the MIT license, it is commercially usable and integrates with transformers, vLLM, and SGLang inference frameworks. It includes FP8 variants for faster inference and reduced memory requirements. Despite its smaller size compared to full GLM-4.5, GLM-4.5-Air maintains high performance.
Features
- 106B total / 12B active parameters (Mixture-of-Experts design)
- Dual-mode hybrid reasoning: thinking & non-thinking
- Supports English and Chinese
- Strong benchmark results (score: 59.8 across 12 tests)
- Efficient FP8 and base versions available
- Open-source and commercially usable (MIT license)
- Built for intelligent agent applications
- Compatible with Transformers, vLLM, and SGLang Preguntar a ChatGPT