GLM-4.5 is a cutting-edge open-source large language model designed by Z.ai for intelligent agent applications. The flagship GLM-4.5 model has 355 billion total parameters with 32 billion active parameters, while the compact GLM-4.5-Air version offers 106 billion total parameters and 12 billion active parameters. Both models unify reasoning, coding, and intelligent agent capabilities, providing two modes: a thinking mode for complex reasoning and tool usage, and a non-thinking mode for immediate responses. They are released under the MIT license, allowing commercial use and secondary development. GLM-4.5 achieves strong performance on 12 industry-standard benchmarks, ranking 3rd overall, while GLM-4.5-Air balances competitive results with greater efficiency. The models support FP8 and BF16 precision, and can handle very large context windows of up to 128K tokens. Flexible inference is supported through frameworks like vLLM and SGLang with tool-call and reasoning parsers included.
Features
- Hybrid reasoning with both a “thinking” mode and a “non-thinking” (fast) mode
- Mixture-of-Experts (MoE) architecture to activate a subset of parameters and improve compute efficiency
- Support for tool usage / agentic capabilities (e.g. invoking external tools)
- Code generation / coding abilities integrated into the model’s capability
- Speculative decoding (MTP layers) to accelerate inference
- Multiple precision versions (e.g. BF16, FP8) and variants (full, Air) for trade-offs in performance and resource use
- Large-scale foundation model with 355B parameters (32B active) and compact 106B variant (12B active)
- High benchmark performance across 12 industry-standard tests, ranking 3rd overall
- Integrated tool-call and reasoning parsers compatible with vLLM and SGLang inference frameworks
- Supports hybrid reasoning with thinking and non-thinking modes for flexible interaction
- Supports FP8 and BF16 precision for efficient inference on modern GPUs
- Supports fine-tuning via LoRA, supervised fine-tuning (SFT), and reinforcement learning (RL)
- Open-source under MIT license, enabling commercial and secondary development
- Extremely long context length of up to 128,000 tokens for complex, large-scale tasks
License
Apache License V2.0Follow GLM-4.5
User Reviews
-
One of the best open source AI models for sure