DeepSeek-V3.1-Terminus is an updated release in the DeepSeek-V3.1 series, maintaining the original model’s large-scale reasoning and generative capabilities while addressing several key user-reported issues. It improves language consistency, reducing mixed Chinese-English outputs and eliminating abnormal characters, enhancing reliability in multilingual scenarios. The update also refines agentic capabilities, especially for the Code Agent and Search Agent, leading to better tool integration and query handling. Benchmarks show small but notable gains, such as raising MMLU-Pro from 84.8 to 85.0, GPQA-Diamond from 80.1 to 80.7, and SWE Verified from 66.0 to 68.4, along with significant improvements in agent benchmarks like BrowseComp (30.0 → 38.5) and Terminal-bench (31.3 → 36.7). The model structure remains the same as DeepSeek-V3, ensuring compatibility with existing deployment methods, with updated inference demos provided for community use.
Features
- 685B parameter model with BF16, FP8, and F32 tensor formats
- Improved multilingual consistency, reducing mixed-language outputs
- Optimized Code Agent and Search Agent for better tool integration
- Benchmark improvements in reasoning (MMLU-Pro 85.0, GPQA-Diamond 80.7)
- Stronger performance in agentic tasks (BrowseComp 38.5, Terminal-bench 36.7)
- Maintains DeepSeek-V3.1 architecture with updated inference demos
- Compatible with existing DeepSeek-V3 deployment methods (vLLM, SGLang)
- MIT-licensed for open use in research and development