Qwen3-Next
Qwen3-Next: 80B instruct LLM with ultra-long context up to 1M tokens
Qwen3-Next-80B-A3B-Instruct is the flagship release in the Qwen3-Next series, designed as a next-generation foundation model for ultra-long context and efficient reasoning. With 80B total parameters and 3B activated at a time, it leverages hybrid attention (Gated DeltaNet + Gated Attention) and a high-sparsity Mixture-of-Experts architecture to achieve exceptional efficiency. The model natively supports a context length of 262K tokens and can be extended up to 1 million tokens using RoPE...