
Highlights

The SGLang team is excited to announce the release of v0.4.5! This version introduces several significant features, including Llama 4 support, a FlashAttention 3 backend, EAGLE3 speculative decoding, DeepEP integration, and disaggregated prefill and decoding.

New Features

  • Llama 4 Support: We added support for Llama 4 models, with accuracy matching the official benchmark numbers: a zero-shot score of 75.2 on the MMLU Pro dataset for the Llama-4-Scout-17B-16E-Instruct model and 80.7 for the Llama-4-Maverick-17B-128E-Instruct model. https://github.com/sgl-project/sglang/pull/5092

  • FlashAttention 3 Backend: Our implementation of the FlashAttention 3 backend delivers significant acceleration for long-context tasks. https://github.com/sgl-project/sglang/issues/4709

  • EAGLE3 Speculative Decoding: We’re proud to be the first to support EAGLE3 speculative decoding, offering substantial gains in decoding throughput. Learn more in our documentation and the EAGLE3 paper. https://github.com/sgl-project/sglang/pull/4247

  • DeepEP Integration: By incorporating DeepEP, we enhanced performance for MoE inference.

  • Disaggregated Prefill and Decoding: We introduced a prototype for disaggregated prefill and decoding, with plans for further optimizations.
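
As a rough illustration of how two of the features above might be switched on at server launch, the sketch below assembles an `sglang.launch_server` command line without running it. The flag names (`--attention-backend`, `--speculative-algorithm`, `--speculative-draft-model-path`) and the `fa3` backend value reflect our reading of SGLang's server arguments and should be checked against `python3 -m sglang.launch_server --help` for your installed version. The helper `build_launch_command`, the `meta-llama/` repository prefix, and the draft-model path are hypothetical and used only for illustration.

```python
import shlex


def build_launch_command(model_path, attention_backend=None, eagle3_draft_model=None):
    """Return an argv list for starting an SGLang server (nothing is executed).

    Flag names are assumptions based on SGLang's server arguments;
    verify them with `python3 -m sglang.launch_server --help`.
    """
    cmd = ["python3", "-m", "sglang.launch_server", "--model-path", model_path]
    if attention_backend is not None:
        # Assumed value "fa3" selects the FlashAttention 3 backend.
        cmd += ["--attention-backend", attention_backend]
    if eagle3_draft_model is not None:
        # EAGLE3 speculative decoding needs a separate draft model.
        cmd += [
            "--speculative-algorithm", "EAGLE3",
            "--speculative-draft-model-path", eagle3_draft_model,
        ]
    return cmd


if __name__ == "__main__":
    # Model name taken from the notes above; the "meta-llama/" prefix is assumed.
    fa3_cmd = build_launch_command(
        "meta-llama/Llama-4-Scout-17B-16E-Instruct",
        attention_backend="fa3",
    )
    print(shlex.join(fa3_cmd))

    # "path/to/..." values are placeholders, not real checkpoints.
    eagle3_cmd = build_launch_command(
        "path/to/target-model",
        eagle3_draft_model="path/to/eagle3-draft-model",
    )
    print(shlex.join(eagle3_cmd))
```

The printed commands can be pasted into a shell once the flags have been confirmed. Building the argument list separately from execution keeps the sketch runnable on machines without SGLang or a GPU.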

Thanks very much to the NVIDIA team, LinkedIn team, EAGLE team, Oracle team, Meituan team, and our incredible open-source community for their invaluable contributions!

Coming Soon

We’re thrilled about these advancements and eager to hear your feedback! Join us on our Slack channel at slack.sglang.ai to connect and share your thoughts. Cheers!

What's Changed

Full Changelog: https://github.com/sgl-project/sglang/compare/v0.4.4...v0.4.5

Source: README.md, updated 2025-04-07