v0.2.4 is here! Thanks to everyone who contributed to this release.
Major Updates
In addition to a broad set of bug fixes and stability improvements, v0.2.4 brings several major updates:
- Profiling and observability improvements: Added a rollout trace timeline viewer and W&B reporting for dynamic ITL / TTFT percentile metrics.
- Router stack unified on sgl-router: Consolidated the router stack onto sgl-router and removed slime-router.
- Expanded multimodal and model support: Improved support for GLM-4.6V / GLM4V, Multimodal OPD, and Qwen3.5-related workflows.
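For readers unfamiliar with the ITL / TTFT percentile metrics mentioned above, the following is a minimal sketch of how such values can be derived from per-token timestamps. It is not slime's implementation: the function names, the nearest-rank percentile method, and the sample timestamps are all assumptions made for illustration.

```python
import math


def percentile(values, q):
    """Nearest-rank percentile of a non-empty list, with q in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


def latency_metrics(request_start_ms, token_times_ms):
    """Return (TTFT, list of ITLs) for one request, all in milliseconds.

    TTFT is the delay from request start to the first token;
    each ITL is the gap between two consecutive tokens.
    """
    ttft = token_times_ms[0] - request_start_ms
    itls = [b - a for a, b in zip(token_times_ms, token_times_ms[1:])]
    return ttft, itls


# Hypothetical data: (request start, token arrival times), in milliseconds.
requests = [
    (0, [250, 300, 350, 450]),
    (1000, [1500, 1550, 1600]),
]

ttfts, itls = [], []
for start, times in requests:
    ttft, gaps = latency_metrics(start, times)
    ttfts.append(ttft)
    itls.extend(gaps)

summary = {
    "ttft_p50": percentile(ttfts, 50),
    "ttft_p99": percentile(ttfts, 99),
    "itl_p50": percentile(itls, 50),
    "itl_p99": percentile(itls, 99),
}
print(summary)
```

A dictionary of this shape is the kind of payload typically passed to a metrics logger; how slime actually aggregates and reports these values to W&B is not described in these notes.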
Other Notable Changes
- Fixed CUDA IPC cache leaks during weight updates
- Fixed SP/CP gradient inflation in FLA layers
What's Changed
- feat: add GLM-4.6V MoE VL bridge with CP support by @zhuzilin in https://github.com/THUDM/slime/pull/1715
- fix: resolve rope_theta from rope_parameters dict in HF config validation by @zhuzilin in https://github.com/THUDM/slime/pull/1720
- [docker] patches for glm4.6v, kimi k2.5 and dsa cp only by @zhuzilin in https://github.com/THUDM/slime/pull/1722
- Fix CUDA IPC cache leaks during weight updates by @zhuzilin in https://github.com/THUDM/slime/pull/1731
- [docker] update megatron by @zhuzilin in https://github.com/THUDM/slime/pull/1729
- [docker] Fix IndexCache with mla model by @zhuzilin in https://github.com/THUDM/slime/pull/1736
- [slime-router] support pd disaggregation and remove radix tree middleware by @zhuzilin in https://github.com/THUDM/slime/pull/1735
- Fix glm4v megatron bridge by @zhuzilin in https://github.com/THUDM/slime/pull/1738
- [docker] update sglang patch by @zhuzilin in https://github.com/THUDM/slime/pull/1743
- feat: GLM4V multimodal support improvements by @zhuzilin in https://github.com/THUDM/slime/pull/1745
- feat: placeholder worker type, metrics router, and GPQA letter range by @zhuzilin in https://github.com/THUDM/slime/pull/1746
- always enable_metrics and remove dp context by @zhuzilin in https://github.com/THUDM/slime/pull/1747
- fix: resolve SP/CP gradient inflation in FLA (linear attention) layers by @zhuzilin in https://github.com/THUDM/slime/pull/1748
- Update MTP example configs, rename GLM-4.5 to GLM-4.7, clean scripts by @zhuzilin in https://github.com/THUDM/slime/pull/1749
- Support qwen3.5 loss mask for multi-turn SFT by @huang3eng in https://github.com/THUDM/slime/pull/1742
- fix: propagate moe_token_dispatcher_type in bridge model provider by @nanjiangwill in https://github.com/THUDM/slime/pull/1737
- fix: resolve rope_theta from rope_parameters in DeepseekV32Bridge by @stevewx in https://github.com/THUDM/slime/pull/1734
- chore: translate remaining Chinese comments to English by @WangHong-yang in https://github.com/THUDM/slime/pull/1726
- feat: add Qwen3.5-4B model support by @shihaohou in https://github.com/THUDM/slime/pull/1721
- fix: http_utils. disable system proxy for internal SGLang httpx clients by @DongzhuoranZhou in https://github.com/THUDM/slime/pull/1714
- fix: auto-detect GPUs in qwen3-4b script by @ailuntz in https://github.com/THUDM/slime/pull/1700
- fix: quote `$MOE_LAYER_FREQ` by @lawrence-harmonic in https://github.com/THUDM/slime/pull/1689
- disable router health_check and allow prompt_data is None by @zhuzilin in https://github.com/THUDM/slime/pull/1751
- small fix on qwen3-235b-a22b launch script by @Zhuohao-Li in https://github.com/THUDM/slime/pull/1719
- sync internal bugfix by @zhuzilin in https://github.com/THUDM/slime/pull/1765
- Fix uploading sglang metrics to wandb by @zhuzilin in https://github.com/THUDM/slime/pull/1768
- use zhuzilin/sgl-router for sglang-router by @zhuzilin in https://github.com/THUDM/slime/pull/1770
- [docker] update sgl-router by @zhuzilin in https://github.com/THUDM/slime/pull/1772
- [Multimodal] Add Multimodal OPD support by @coding-famer in https://github.com/THUDM/slime/pull/1760
- refactor: remove slime router by @zhuzilin in https://github.com/THUDM/slime/pull/1773
- Add rollout trace timeline viewer by @zhuzilin in https://github.com/THUDM/slime/pull/1776
- [Fix] Fix duplicate Megatron LR scheduler resume when optimizer state is not loaded by @kaysonyu in https://github.com/THUDM/slime/pull/1775
- Support FP8 conversion for Qwen3.5 by @peterjc123 in https://github.com/THUDM/slime/pull/1769
- fix typo by @albaNnaksqr in https://github.com/THUDM/slime/pull/1759
- [Fix]Fix some bugs/clean up by @coding-famer in https://github.com/THUDM/slime/pull/1756
- (fix):not have encoder_only attr cause run failed by @wangyufak in https://github.com/THUDM/slime/pull/1741
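Two of the fixes above (#1720 and #1734) concern reading `rope_theta` from a `rope_parameters` dict in the Hugging Face config. The sketch below illustrates the general idea of that lookup on a plain dict. It is not the code from those PRs: the helper name `resolve_rope_theta`, the fallback order, and the default value are assumptions for illustration only.

```python
def resolve_rope_theta(config, default=10000.0):
    """Look up rope_theta in a config dict (hypothetical helper).

    Prefers the nested `rope_parameters` dict, then falls back to a
    top-level `rope_theta` key, then to an assumed default.
    """
    params = config.get("rope_parameters")
    if isinstance(params, dict) and "rope_theta" in params:
        return params["rope_theta"]
    return config.get("rope_theta", default)


# Config that nests the value under rope_parameters.
nested = {"rope_parameters": {"rope_type": "default", "rope_theta": 1000000.0}}
# Config that keeps the value at the top level.
flat = {"rope_theta": 500000.0}

print(resolve_rope_theta(nested))
print(resolve_rope_theta(flat))
print(resolve_rope_theta({}))
```

Checking the nested location first means a config that carries both keys resolves to the nested value; whether that matches the precedence chosen in the actual fixes is not stated in these notes.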
New Contributors
- @stevewx made their first contribution in https://github.com/THUDM/slime/pull/1734
- @WangHong-yang made their first contribution in https://github.com/THUDM/slime/pull/1726
- @shihaohou made their first contribution in https://github.com/THUDM/slime/pull/1721
- @DongzhuoranZhou made their first contribution in https://github.com/THUDM/slime/pull/1714
- @ailuntz made their first contribution in https://github.com/THUDM/slime/pull/1700
- @peterjc123 made their first contribution in https://github.com/THUDM/slime/pull/1769
- @albaNnaksqr made their first contribution in https://github.com/THUDM/slime/pull/1759
- @wangyufak made their first contribution in https://github.com/THUDM/slime/pull/1741
Full Changelog: https://github.com/THUDM/slime/compare/v0.2.3...v0.2.4