SGLang Release v0.4.7

Highlights

  • The PD (prefill-decode) disaggregation and large-scale EP (expert parallelism) features described in the blog post have now been fully merged into the latest release.

  • The results in the blog post have been successfully reproduced by more than six industry teams, including the TensorRT LLM team.

  • SGLang’s large-scale EP is now actively used by leading organizations such as Cursor, Qwen, Alimama, Alibaba Cloud, iFlytek, and more. It has been deployed and validated at large scale, running on GPU clusters with thousands of devices.

  • In addition to DeepSeek V3/R1, PD disaggregation and large-scale EP now also support Qwen 3.

  • Full Blackwell support for DeepSeek V3/R1, Llama 4, and Qwen 3. Further optimizations are underway.

  • SGLang's DeepSeek V3/R1 now achieves 190 TPS (tokens per second) on a single H200, outperforming other frameworks by over 50%.

We extend our sincere thanks to the following contributors, listed in alphabetical order: Alibaba Cloud, AMD Team, Ant Group, Baseten Team, Cursor Team, Dynamo Team, EAGLE Team, FlashInfer Team, Google Vertex AI Team, iFlytek MaaS Team, Intel Team, LinkedIn Team, Meituan Team, Microsoft Copilot Team, Mooncake Team, NVIDIA Team, Oracle Team, Qwen Team, Voltage Park Team and open source community users. Your support and collaboration are deeply appreciated!

Full Changelog: https://github.com/sgl-project/sglang/compare/v0.4.6...v0.4.7

Source: README.md, updated 2025-06-10