Command A+ 05-2026 W4A4 is a 4-bit quantized version of Cohere’s open-source Command A+ model, optimized for enterprise-grade agentic, multilingual, and reasoning-heavy workloads. It accepts text and image inputs, generates text outputs, and uses a sparse Mixture-of-Experts Transformer architecture with 218B total parameters, of which 25B are active. The W4A4 release applies 4-bit weight and activation quantization mainly to the MoE experts and keeps attention components at full precision, which limits quality loss while improving latency and hardware efficiency. Cohere recommends W4A4 for most users because it offers a smaller hardware footprint with negligible benchmark differences compared to the BF16 and FP8 versions. The model supports a 128K input context and 64K output length, covers 48 languages, and includes conversational tool-use capabilities with JSON-schema tools and optional citation grounding.
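To give a rough sense of why 4-bit weights shrink the hardware footprint, the sketch below estimates weight storage from the 218B parameter count at different bit widths. It is a back-of-envelope calculation, not a published figure: it ignores quantization scales, the KV cache, and activation memory. The share of parameters that sit in the MoE experts is not stated here, so `HYPOTHETICAL_EXPERT_FRACTION` is a placeholder, and treating "full precision" as 16 bits is also an assumption.

```python
TOTAL_PARAMS = 218e9  # total parameter count stated in the description


def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


def mixed_precision_gb(
    num_params: float,
    quantized_fraction: float,
    low_bits: float = 4,
    high_bits: float = 16,  # assumption: unquantized parts are stored in 16 bits
) -> float:
    """Weight storage when only a fraction of the parameters is quantized."""
    low = weight_memory_gb(num_params * quantized_fraction, low_bits)
    high = weight_memory_gb(num_params * (1 - quantized_fraction), high_bits)
    return low + high


# Placeholder value for illustration only; the real expert share is not given.
HYPOTHETICAL_EXPERT_FRACTION = 0.95

if __name__ == "__main__":
    for label, bits in (("16-bit", 16), ("8-bit", 8), ("4-bit, all weights", 4)):
        print(f"{label}: {weight_memory_gb(TOTAL_PARAMS, bits):.0f} GB")
    mixed = mixed_precision_gb(TOTAL_PARAMS, HYPOTHETICAL_EXPERT_FRACTION)
    print(f"4-bit experts, 16-bit remainder (assumed split): {mixed:.1f} GB")
```

Because only the active 25B parameters take part in each forward pass, compute per token is far lower than the total count suggests, but all 218B parameters still have to be stored, which is why the weight footprint is computed from the total.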
Features
- 4-bit W4A4 quantization for efficient deployment
- 218B total parameters with 25B active parameters
- Sparse Mixture-of-Experts Transformer architecture
- Supports text and image inputs
- 128K input context and 64K output length
- Conversational tool use with JSON-schema tools
- Optional citation grounding for tool-supported answers
- Trained for 48 languages and enterprise workflows
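As an illustration of what a JSON-schema tool can look like, the sketch below defines one tool and checks a set of model-proposed arguments against it before the tool is run. The tool name, its fields, and the `check_arguments` helper are hypothetical. The outer `{"type": "function", ...}` wrapper follows a common convention and is an assumption about what a given serving stack expects, so check the documentation of the API you deploy behind.

```python
import json

# Hypothetical tool definition; name, description, and fields are illustrative.
get_order_status_tool = {
    "type": "function",
    "function": {
        "name": "get_order_status",
        "description": "Look up the shipping status of a customer order.",
        "parameters": {  # a JSON Schema object describing the arguments
            "type": "object",
            "properties": {
                "order_id": {"type": "string", "description": "Order identifier."},
                "include_history": {"type": "boolean", "description": "Return past events too."},
            },
            "required": ["order_id"],
        },
    },
}

# Deliberately partial mapping: only the JSON types used by the tool above.
_PYTHON_TYPES = {"string": str, "boolean": bool}


def check_arguments(tool: dict, arguments: dict) -> list[str]:
    """Return a list of problems; an empty list means the arguments look usable.

    Checks only required names and top-level types, not the full JSON Schema spec.
    """
    schema = tool["function"]["parameters"]
    problems = []
    for name in schema.get("required", []):
        if name not in arguments:
            problems.append(f"missing required argument: {name}")
    for name, value in arguments.items():
        spec = schema["properties"].get(name)
        if spec is None:
            problems.append(f"unexpected argument: {name}")
            continue
        expected = _PYTHON_TYPES.get(spec["type"])
        if expected is not None and not isinstance(value, expected):
            problems.append(f"wrong type for {name}: expected {spec['type']}")
    return problems


if __name__ == "__main__":
    # Arguments as they might arrive from the model, serialized as JSON text.
    proposed = json.loads('{"order_id": "A-1001", "include_history": true}')
    print(check_arguments(get_order_status_tool, proposed))
```

Validating arguments before executing a tool keeps malformed calls from reaching backend systems. For production use, a full JSON Schema validator is a better fit than this hand-rolled check.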