DeepSeek-V4-Flash is a preview Mixture-of-Experts language model built for efficient reasoning over million-token contexts. It has 284B total parameters, of which 13B are activated per token, and supports a 1M-token context window, making it suitable for long-document reasoning, complex coding, agentic workflows, and large-scale information processing. The model uses a hybrid attention architecture that combines Compressed Sparse Attention (CSA) and Heavily Compressed Attention (HCA) to improve long-context efficiency, while Manifold-Constrained Hyper-Connections strengthen signal stability across layers. It is trained on more than 32T tokens and refined through a post-training pipeline of supervised fine-tuning, reinforcement learning, domain-specific expert cultivation, and on-policy distillation. DeepSeek-V4-Flash supports non-think, think, and think-max reasoning modes, letting users trade speed against reasoning depth. It is smaller than DeepSeek-V4-Pro but approaches Pro-level reasoning quality.
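The sparse-activation figures above mean only a small fraction of the parameters participate in each forward pass, typically via top-k gating over a pool of experts. A minimal sketch of that routing idea in plain Python (the expert count and k are illustrative assumptions, not published V4-Flash internals):

```python
import math

def top_k_routing(logits, k=2):
    """Pick the k highest-scoring experts and renormalize their gate weights.

    logits: raw router scores, one per expert.
    Returns a dict mapping expert index -> softmax weight over the chosen k.
    """
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    exps = {i: math.exp(logits[i]) for i in top}
    total = sum(exps.values())
    return {i: exps[i] / total for i in top}

# Example: a router over 8 experts sends this token to the top 2.
gates = top_k_routing([0.1, 2.0, -1.0, 0.5, 1.5, 0.0, -0.5, 0.3], k=2)
print(gates)  # gate weights for experts 1 and 4, summing to 1.0
```

Only the selected experts' weights are loaded into compute for that token, which is how a 284B-parameter model can run with roughly 13B-parameter per-token cost.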

Features

  • 1M-token context window for ultra-long tasks
  • 284B total parameters with 13B activated
  • Mixture-of-Experts architecture for efficient inference
  • Hybrid attention combining Compressed Sparse Attention (CSA) and Heavily Compressed Attention (HCA)
  • Three reasoning modes: non-think, think, and think-max
  • Post-trained with SFT, RL, GRPO, and on-policy distillation
  • Strong coding, reasoning, and agentic benchmark performance
  • MIT-licensed weights for open model deployment
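The parameter counts above translate into rough serving estimates. A back-of-the-envelope sketch (the precision choices are assumptions; real memory use also depends on quantization scheme, KV cache, and runtime overhead):

```python
TOTAL_PARAMS = 284e9   # total MoE parameters
ACTIVE_PARAMS = 13e9   # parameters activated per token

# Fraction of the model that participates in each forward pass.
active_ratio = ACTIVE_PARAMS / TOTAL_PARAMS
print(f"activated per token: {active_ratio:.1%}")  # ~4.6%

# Weight memory at common precisions (weights only; ignores KV cache
# and activation memory, which dominate at 1M-token contexts).
for name, bytes_per_param in [("FP16", 2), ("FP8", 1), ("INT4", 0.5)]:
    gib = TOTAL_PARAMS * bytes_per_param / 2**30
    print(f"{name}: ~{gib:,.0f} GiB of weights")
```

Note that all 284B parameters must still be resident in memory; the 13B activation figure reduces per-token compute, not weight storage.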


Additional Project Details

Registered: 2026-04-24