| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| README.md | 2025-12-12 | 10.5 kB | |
| v0.2.1 source code.tar.gz | 2025-12-12 | 3.8 MB | |
| v0.2.1 source code.zip | 2025-12-12 | 4.1 MB | |
| Totals: 3 Items | | 7.9 MB | 0 |
Thanks to the incredible support and contributions from our community — v0.2.1 is here!
Major Updates
- VLM + FSDP: true on-policy training on Qwen3-VL (dense).
- PD-disaggregation support during rollout
- DP-attention support in rollout routing replay (R3)
- Upgraded to SGLang v0.5.6
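As background for the first item, the following is a minimal sketch of how one might sanity-check a "true on-policy" setup. It assumes the term means that the per-token log-probabilities recomputed by the training backend match those recorded by the rollout engine exactly. The function names `max_logprob_gap` and `is_true_on_policy` are hypothetical and are not part of slime's API.

```python
def max_logprob_gap(rollout_logprobs, train_logprobs):
    """Largest absolute per-token gap between rollout and training log-probs.

    Both arguments are flat lists of floats for the same sampled tokens.
    (Hypothetical helper for illustration; not a slime function.)
    """
    if len(rollout_logprobs) != len(train_logprobs):
        raise ValueError("log-prob sequences must have the same length")
    return max(
        (abs(r - t) for r, t in zip(rollout_logprobs, train_logprobs)),
        default=0.0,
    )


def is_true_on_policy(rollout_logprobs, train_logprobs, tol=0.0):
    """Return True when the gap is within tol (0.0 means an exact match)."""
    return max_logprob_gap(rollout_logprobs, train_logprobs) <= tol


# Made-up values: identical log-probs pass, mismatched ones do not.
print(is_true_on_policy([-0.5, -1.25], [-0.5, -1.25]))
print(is_true_on_policy([-0.5, -1.25], [-0.5, -1.5]))
```

A nonzero tolerance can be passed through `tol` when only an approximate match between the two backends is expected.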
What's Changed
- extract mla update weight logic out by @zhuzilin in https://github.com/THUDM/slime/pull/960
- support do all evals together by @zhuzilin in https://github.com/THUDM/slime/pull/959
- Add --rollout-sample-filter-path by @zhuzilin in https://github.com/THUDM/slime/pull/961
- [FSDP] Optimize FSDP2 Model Loading with Rank-0 Broadcast by @Hecate0821 in https://github.com/THUDM/slime/pull/915
- Add sample.remove_sample by @zhuzilin in https://github.com/THUDM/slime/pull/977
- add --eval-max-prompt-len by @zhuzilin in https://github.com/THUDM/slime/pull/978
- Add args check for max_context_len by @zhuzilin in https://github.com/THUDM/slime/pull/979
- Remove hard coded balance_abs_threshold by @zhuzilin in https://github.com/THUDM/slime/pull/981
- Tiny fix fp8_cast_bf16 not copying chat template by @fzyzcjy in https://github.com/THUDM/slime/pull/964
- Super tiny install dnsutils in dockerfile by @fzyzcjy in https://github.com/THUDM/slime/pull/965
- Super tiny sanity check checkpoint dir by @fzyzcjy in https://github.com/THUDM/slime/pull/966
- Fix convert_hf_to_torch_dist OOM by @fzyzcjy in https://github.com/THUDM/slime/pull/967
- Tiny support using environment variables in addition to arguments for all scripts by @fzyzcjy in https://github.com/THUDM/slime/pull/968
- Super tiny increase default timeout sec by @fzyzcjy in https://github.com/THUDM/slime/pull/969
- Fix random port in use error even though already have free port detection by @fzyzcjy in https://github.com/THUDM/slime/pull/970
- Super tiny enable draft-weights-cpu-backup to avoid MTP acc len issue by @fzyzcjy in https://github.com/THUDM/slime/pull/971
- Add generation function for benchmarking purpose by @fzyzcjy in https://github.com/THUDM/slime/pull/972
- Support zero host or device memory waste for weight update by @fzyzcjy in https://github.com/THUDM/slime/pull/973
- Add fp8 kv cache and tis in qwen3 30b a3b script by @fzyzcjy in https://github.com/THUDM/slime/pull/974
- Add GB200, MTP, benchmark, fp8 rollout mode to glm script by @fzyzcjy in https://github.com/THUDM/slime/pull/975
- [FSDP] Add private func indicator for better usage by @PopSoda2002 in https://github.com/THUDM/slime/pull/982
- [Bugfix] Rename save model by @PopSoda2002 in https://github.com/THUDM/slime/pull/983
- Fix: resolve variable shadowing bug in setup_model_and_optimizer by @fangzhensheng in https://github.com/THUDM/slime/pull/963
- remove unnecessary optimizer init by @zhuzilin in https://github.com/THUDM/slime/pull/984
- [release] bump to v0.2.0.post1 by @zhuzilin in https://github.com/THUDM/slime/pull/986
- fix scaling of per token loss by @zhuzilin in https://github.com/THUDM/slime/pull/987
- Add strands-agents example by @Lawhy in https://github.com/THUDM/slime/pull/976
- Add nemo skills evaluation by @guapisolo in https://github.com/THUDM/slime/pull/989
- [1/N] Tiny execute Ruff auto lint by @fzyzcjy in https://github.com/THUDM/slime/pull/991
- [2/N] Tiny manually fix for Ruff default ruleset and add to pre-commit by @fzyzcjy in https://github.com/THUDM/slime/pull/992
- [3/N] Enable `B` ruleset in Ruff by @fzyzcjy in https://github.com/THUDM/slime/pull/993
- [4/N] Tiny enable `UP` ruleset in Ruff by @fzyzcjy in https://github.com/THUDM/slime/pull/994
- Super tiny further fix lint error by @fzyzcjy in https://github.com/THUDM/slime/pull/995
- Add DataSource and --data-source-path by @zhuzilin in https://github.com/THUDM/slime/pull/912
- Fix per token loss scale and add e2e ci by @zhuzilin in https://github.com/THUDM/slime/pull/990
- [FSDP] Add script for FSDP Qwen3-4B by @Hecate0821 in https://github.com/THUDM/slime/pull/988
- Fixed bug in checking max_length for SFT [#997] by @Surya-Gunukula in https://github.com/THUDM/slime/pull/998
- [ci] Add CI to make sure all dense parallel gives the same grad norm by @zhuzilin in https://github.com/THUDM/slime/pull/1000
- [Feature] Add off-policy sequence masking algorithm proposed in DeepSeek v3.2 by @yitianlian in https://github.com/THUDM/slime/pull/999
- [FSDP][3/N] support true_on_policy training for FSDP2 by @zhuzilin in https://github.com/THUDM/slime/pull/1001
- fix lint by @zhuzilin in https://github.com/THUDM/slime/pull/1002
- Fix bare except clause and remove redundant computation in ppo_utils by @lancerts in https://github.com/THUDM/slime/pull/1007
- fix: FSDP runnable for Qwen3-30b-a3b by @yueming-yuan in https://github.com/THUDM/slime/pull/1010
- move tis function outside by @zhuzilin in https://github.com/THUDM/slime/pull/1014
- Add backward impl for SiluAndMulFunction and MoeSumReduceFunction by @zhuzilin in https://github.com/THUDM/slime/pull/1015
- refactor: expose compute_metrics_from_samples as public by @lancerts in https://github.com/THUDM/slime/pull/1012
- Fix evaluation parameter parsing by @guapisolo in https://github.com/THUDM/slime/pull/1005
- pre-commit run --all-files by @lancerts in https://github.com/THUDM/slime/pull/1021
- fix: update deprecated import path in mcore2hf script by @Chen-GX in https://github.com/THUDM/slime/pull/1003
- [FSDP] Add gpt oss 20b script by @PopSoda2002 in https://github.com/THUDM/slime/pull/996
- Fix mimo speculative decoding oom by @guapisolo in https://github.com/THUDM/slime/pull/1024
- [FSDP, VLM] feat: add vlm training for FSDP by @nanjiangwill in https://github.com/THUDM/slime/pull/501
- [rollout] support disable trim samples when converting rollout samples to train datas by @GGGGGGXY in https://github.com/THUDM/slime/pull/1016
- Backward compatible for older megatron version by @zhuzilin in https://github.com/THUDM/slime/pull/1028
- extract all sglang deps in megatron actor to one file by @zhuzilin in https://github.com/THUDM/slime/pull/1029
- feat: Add Unbiased KL Estimation from DeepSeek-V3.2 by @kekmodel in https://github.com/THUDM/slime/pull/1004
- refactor: extract duplicated checkpoint interval logic into reusable helper by @lancerts in https://github.com/THUDM/slime/pull/1027
- Fix typo in sglang_rollout.py comment by @ChenmienTan in https://github.com/THUDM/slime/pull/980
- fix ci for nodes with proxy by @zhuzilin in https://github.com/THUDM/slime/pull/1035
- [FSDP] fix args error in apply_fsdp2 function by @ChangyiYang in https://github.com/THUDM/slime/pull/1041
- [FSDP] Support lr scheduler by @ChangyiYang in https://github.com/THUDM/slime/pull/1040
- [Fix] Fix some bugs when on/offload model by @yitianlian in https://github.com/THUDM/slime/pull/1038
- Improve debug output formatting in replay_reward_fn.py by @lancerts in https://github.com/THUDM/slime/pull/1033
- Support pd disaggregation with p and d of same config by @zhuzilin in https://github.com/THUDM/slime/pull/1046
- [rollout] Truncate last token for rollout routing replay by @Hecate0821 in https://github.com/THUDM/slime/pull/1045
- fix: modernize type hint and add distributed init checks in utils by @lancerts in https://github.com/THUDM/slime/pull/1049
- Fix the padding of rollout routing replay experts by @zhuzilin in https://github.com/THUDM/slime/pull/1052
- update sglang to 0.5.6 by @lilei199908 in https://github.com/THUDM/slime/pull/1051
- [docker] fix cudnn version by @zhuzilin in https://github.com/THUDM/slime/pull/1066
- [docker] fix megatron cpu adam load issue by @zhuzilin in https://github.com/THUDM/slime/pull/1070
- fix(examples): correct quotes and comment out ray cleanup commands in Qwen3-30B-A3B FP8 script by @pandengyao in https://github.com/THUDM/slime/pull/1069
- Fix typos and improve clarity in documentation and code comments by @lancerts in https://github.com/THUDM/slime/pull/1067
- fix: remove redundant gc.collect() and combine split f-strings by @lancerts in https://github.com/THUDM/slime/pull/1074
- [FSDP, VLM] feat: true on policy for VLM by @nanjiangwill in https://github.com/THUDM/slime/pull/1056
- [VLM, FSDP] Update Experiment Readme by @nanjiangwill in https://github.com/THUDM/slime/pull/1079
- split train data in-advance to reduce communication by @zhuzilin in https://github.com/THUDM/slime/pull/1078
- [Feature] PD Disaggregation Support by @yitianlian in https://github.com/THUDM/slime/pull/1080
- fix raw_reward upload in fsdp by @zhuzilin in https://github.com/THUDM/slime/pull/1084
- [FSDP][vlm] Add B200 doc by @PopSoda2002 in https://github.com/THUDM/slime/pull/1082
- Add recompute loss function and enable by default by @zhuzilin in https://github.com/THUDM/slime/pull/1083
- Empty cache before finalize_model_grads to prevent unexpected oom by @zhuzilin in https://github.com/THUDM/slime/pull/1086
- Revert "Empty cache before finalize_model_grads to prevent unexpected oom" by @zhuzilin in https://github.com/THUDM/slime/pull/1087
- Set --train-memory-margin-bytes to 1GB by default by @zhuzilin in https://github.com/THUDM/slime/pull/1088
- set recompute_loss_function to false by default by @zhuzilin in https://github.com/THUDM/slime/pull/1089
- [VLM] fix: fix non true-on-policy vlm regression by @nanjiangwill in https://github.com/THUDM/slime/pull/1093
- fix_load_ckpt by @lilei199908 in https://github.com/THUDM/slime/pull/1095
- fix actor init bugs by @lilei199908 in https://github.com/THUDM/slime/pull/1098
- Fix gqa model tflops compute by @zhuzilin in https://github.com/THUDM/slime/pull/1099
- Fix bug for convert_hf_to_torch_dist.py by @zhuzilin in https://github.com/THUDM/slime/pull/1100
- [release] bump to v0.2.1 by @lilei199908 in https://github.com/THUDM/slime/pull/1096
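Two of the entries above (#987 and #990) concern the scaling of the per-token loss. As general background, and not a reproduction of slime's implementation, the sketch below contrasts two common ways of normalizing a batch of token losses: averaging within each sample first, versus averaging over all tokens at once. The function names and the batch values are made up for illustration.

```python
def per_sample_mean_loss(token_losses):
    """Average within each sample first, then across samples.

    Every sample contributes equally, regardless of its length.
    """
    per_sample = [sum(sample) / len(sample) for sample in token_losses]
    return sum(per_sample) / len(per_sample)


def per_token_mean_loss(token_losses):
    """Average over all tokens in the batch at once.

    Every token contributes equally, so longer samples carry more weight.
    """
    total = sum(sum(sample) for sample in token_losses)
    count = sum(len(sample) for sample in token_losses)
    return total / count


# Hypothetical batch: one short response and one long response.
batch = [[1.0, 1.0], [0.25, 0.25, 0.25, 0.25, 0.25, 0.25]]
print(per_sample_mean_loss(batch))  # 0.625
print(per_token_mean_loss(batch))   # 0.4375
```

The two values differ whenever response lengths differ, which is why the normalization factor has to stay consistent when a batch is split across micro-batches or data-parallel ranks.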
New Contributors
- @fangzhensheng made their first contribution in https://github.com/THUDM/slime/pull/963
- @Lawhy made their first contribution in https://github.com/THUDM/slime/pull/976
- @Surya-Gunukula made their first contribution in https://github.com/THUDM/slime/pull/998
- @nanjiangwill made their first contribution in https://github.com/THUDM/slime/pull/501
- @kekmodel made their first contribution in https://github.com/THUDM/slime/pull/1004
- @ChenmienTan made their first contribution in https://github.com/THUDM/slime/pull/980
- @pandengyao made their first contribution in https://github.com/THUDM/slime/pull/1069
Full Changelog: https://github.com/THUDM/slime/compare/v0.2.0...v0.2.1