slime LLM - Browse /v0.2.1 at SourceForge.net

Name	Modified	Size	Downloads / Week
README.md	2025-12-12	10.5 kB	0
v0.2.1 source code.tar.gz	2025-12-12	3.8 MB	0
v0.2.1 source code.zip	2025-12-12	4.1 MB	0
Totals: 3 Items		7.9 MB	0

Thanks to the incredible support and contributions from our community, v0.2.1 is here!

Major Updates

VLM + FSDP: true on-policy training on Qwen3-VL (dense)
Prefill-decode (PD) disaggregation support during rollout
DP-attention support in rollout routing replay (R3)
Upgraded to SGLang v0.5.6

What's Changed

extract mla update weight logic out by @zhuzilin in https://github.com/THUDM/slime/pull/960
support do all evals together by @zhuzilin in https://github.com/THUDM/slime/pull/959
Add --rollout-sample-filter-path by @zhuzilin in https://github.com/THUDM/slime/pull/961
[FSDP] Optimize FSDP2 Model Loading with Rank-0 Broadcast by @Hecate0821 in https://github.com/THUDM/slime/pull/915
Add sample.remove_sample by @zhuzilin in https://github.com/THUDM/slime/pull/977
add --eval-max-prompt-len by @zhuzilin in https://github.com/THUDM/slime/pull/978
Add args check for max_context_len by @zhuzilin in https://github.com/THUDM/slime/pull/979
Remove hard coded balance_abs_threshold by @zhuzilin in https://github.com/THUDM/slime/pull/981
Tiny fix fp8_cast_bf16 not copying chat template by @fzyzcjy in https://github.com/THUDM/slime/pull/964
Super tiny install dnsutils in dockerfile by @fzyzcjy in https://github.com/THUDM/slime/pull/965
Super tiny sanity check checkpoint dir by @fzyzcjy in https://github.com/THUDM/slime/pull/966
Fix convert_hf_to_torch_dist OOM by @fzyzcjy in https://github.com/THUDM/slime/pull/967
Tiny support using environment variables in addition to arguments for all scripts by @fzyzcjy in https://github.com/THUDM/slime/pull/968
Super tiny increase default timeout sec by @fzyzcjy in https://github.com/THUDM/slime/pull/969
Fix random port in use error even though already have free port detection by @fzyzcjy in https://github.com/THUDM/slime/pull/970
Super tiny enable draft-weights-cpu-backup to avoid MTP acc len issue by @fzyzcjy in https://github.com/THUDM/slime/pull/971
Add generation function for benchmarking purpose by @fzyzcjy in https://github.com/THUDM/slime/pull/972
Support zero host or device memory waste for weight update by @fzyzcjy in https://github.com/THUDM/slime/pull/973
Add fp8 kv cache and tis in qwen3 30b a3b script by @fzyzcjy in https://github.com/THUDM/slime/pull/974
Add GB200, MTP, benchmark, fp8 rollout mode to glm script by @fzyzcjy in https://github.com/THUDM/slime/pull/975
[FSDP] Add private func indicator for better usage by @PopSoda2002 in https://github.com/THUDM/slime/pull/982
[Bugfix] Rename save model by @PopSoda2002 in https://github.com/THUDM/slime/pull/983
Fix: resolve variable shadowing bug in setup_model_and_optimizer by @fangzhensheng in https://github.com/THUDM/slime/pull/963
remove unnecessary optimizer init by @zhuzilin in https://github.com/THUDM/slime/pull/984
[release] bump to v0.2.0.post1 by @zhuzilin in https://github.com/THUDM/slime/pull/986
fix scaling of per token loss by @zhuzilin in https://github.com/THUDM/slime/pull/987
Add strands-agents example by @Lawhy in https://github.com/THUDM/slime/pull/976
Add nemo skills evaluation by @guapisolo in https://github.com/THUDM/slime/pull/989
[1/N] Tiny execute Ruff auto lint by @fzyzcjy in https://github.com/THUDM/slime/pull/991
[2/N] Tiny manually fix for Ruff default ruleset and add to pre-commit by @fzyzcjy in https://github.com/THUDM/slime/pull/992
[3/N] Enable B ruleset in Ruff by @fzyzcjy in https://github.com/THUDM/slime/pull/993
[4/N] Tiny enable UP ruleset in Ruff by @fzyzcjy in https://github.com/THUDM/slime/pull/994
Super tiny further fix lint error by @fzyzcjy in https://github.com/THUDM/slime/pull/995
Add DataSource and --data-source-path by @zhuzilin in https://github.com/THUDM/slime/pull/912
Fix per token loss scale and add e2e ci by @zhuzilin in https://github.com/THUDM/slime/pull/990
[FSDP] Add script for FSDP Qwen3-4B by @Hecate0821 in https://github.com/THUDM/slime/pull/988
Fixed bug in checking max_length for SFT [#997] by @Surya-Gunukula in https://github.com/THUDM/slime/pull/998
[ci] Add CI to make sure all dense parallel gives the same grad norm by @zhuzilin in https://github.com/THUDM/slime/pull/1000
[Feature] Add off-policy sequence masking algorithm proposed in DeepSeek v3.2 by @yitianlian in https://github.com/THUDM/slime/pull/999
[FSDP][3/N] support true_on_policy training for FSDP2 by @zhuzilin in https://github.com/THUDM/slime/pull/1001
fix lint by @zhuzilin in https://github.com/THUDM/slime/pull/1002
Fix bare except clause and remove redundant computation in ppo_utils by @lancerts in https://github.com/THUDM/slime/pull/1007
fix: FSDP runnable for Qwen3-30b-a3b by @yueming-yuan in https://github.com/THUDM/slime/pull/1010
move tis function outside by @zhuzilin in https://github.com/THUDM/slime/pull/1014
Add backward impl for SiluAndMulFunction and MoeSumReduceFunction by @zhuzilin in https://github.com/THUDM/slime/pull/1015
refactor: expose compute_metrics_from_samples as public by @lancerts in https://github.com/THUDM/slime/pull/1012
Fix evaluation parameter parsing by @guapisolo in https://github.com/THUDM/slime/pull/1005
pre-commit run --all-files by @lancerts in https://github.com/THUDM/slime/pull/1021
fix: update deprecated import path in mcore2hf script by @Chen-GX in https://github.com/THUDM/slime/pull/1003
[FSDP] Add gpt oss 20b script by @PopSoda2002 in https://github.com/THUDM/slime/pull/996
Fix mimo speculative decoding oom by @guapisolo in https://github.com/THUDM/slime/pull/1024
[FSDP, VLM] feat: add vlm training for FSDP by @nanjiangwill in https://github.com/THUDM/slime/pull/501
[rollout] support disable trim samples when converting rollout samples to train datas by @GGGGGGXY in https://github.com/THUDM/slime/pull/1016
Backward compatible for older megatron version by @zhuzilin in https://github.com/THUDM/slime/pull/1028
extract all sglang deps in megatron actor to one file by @zhuzilin in https://github.com/THUDM/slime/pull/1029
feat: Add Unbiased KL Estimation from DeepSeek-V3.2 by @kekmodel in https://github.com/THUDM/slime/pull/1004
refactor: extract duplicated checkpoint interval logic into reusable helper by @lancerts in https://github.com/THUDM/slime/pull/1027
Fix typo in sglang_rollout.py comment by @ChenmienTan in https://github.com/THUDM/slime/pull/980
fix ci for nodes with proxy by @zhuzilin in https://github.com/THUDM/slime/pull/1035
[FSDP] fix args error in apply_fsdp2 function by @ChangyiYang in https://github.com/THUDM/slime/pull/1041
[FSDP] Support lr scheduler by @ChangyiYang in https://github.com/THUDM/slime/pull/1040
[Fix] Fix some bugs when on/offload model by @yitianlian in https://github.com/THUDM/slime/pull/1038
Improve debug output formatting in replay_reward_fn.py by @lancerts in https://github.com/THUDM/slime/pull/1033
Support pd disaggregation with p and d of same config by @zhuzilin in https://github.com/THUDM/slime/pull/1046
[rollout] Truncate last token for rollout routing replay by @Hecate0821 in https://github.com/THUDM/slime/pull/1045
fix: modernize type hint and add distributed init checks in utils by @lancerts in https://github.com/THUDM/slime/pull/1049
Fix the padding of rollout routing replay experts by @zhuzilin in https://github.com/THUDM/slime/pull/1052
update sglang to 0.5.6 by @lilei199908 in https://github.com/THUDM/slime/pull/1051
[docker] fix cudnn version by @zhuzilin in https://github.com/THUDM/slime/pull/1066
[docker] fix megatron cpu adam load issue by @zhuzilin in https://github.com/THUDM/slime/pull/1070
fix(examples): correct quotes and comment out ray cleanup commands in Qwen3-30B-A3B FP8 script by @pandengyao in https://github.com/THUDM/slime/pull/1069
Fix typos and improve clarity in documentation and code comments by @lancerts in https://github.com/THUDM/slime/pull/1067
fix: remove redundant gc.collect() and combine split f-strings by @lancerts in https://github.com/THUDM/slime/pull/1074
[FSDP, VLM] feat: true on policy for VLM by @nanjiangwill in https://github.com/THUDM/slime/pull/1056
[VLM, FSDP] Update Experiment Readme by @nanjiangwill in https://github.com/THUDM/slime/pull/1079
split train data in-advance to reduce communication by @zhuzilin in https://github.com/THUDM/slime/pull/1078
[Feature] PD Disaggregation Support by @yitianlian in https://github.com/THUDM/slime/pull/1080
fix raw_reward upload in fsdp by @zhuzilin in https://github.com/THUDM/slime/pull/1084
[FSDP][vlm] Add B200 doc by @PopSoda2002 in https://github.com/THUDM/slime/pull/1082
Add recompute loss function and enable by default by @zhuzilin in https://github.com/THUDM/slime/pull/1083
Empty cache before finalize_model_grads to prevent unexpected oom by @zhuzilin in https://github.com/THUDM/slime/pull/1086
Revert "Empty cache before finalize_model_grads to prevent unexpected oom" by @zhuzilin in https://github.com/THUDM/slime/pull/1087
Set --train-memory-margin-bytes to 1GB by default by @zhuzilin in https://github.com/THUDM/slime/pull/1088
set recompute_loss_function to false by default by @zhuzilin in https://github.com/THUDM/slime/pull/1089
[VLM] fix: fix non true-on-policy vlm regression by @nanjiangwill in https://github.com/THUDM/slime/pull/1093
fix_load_ckpt by @lilei199908 in https://github.com/THUDM/slime/pull/1095
fix actor init bugs by @lilei199908 in https://github.com/THUDM/slime/pull/1098
Fix gqa model tflops compute by @zhuzilin in https://github.com/THUDM/slime/pull/1099
Fix bug for convert_hf_to_torch_dist.py by @zhuzilin in https://github.com/THUDM/slime/pull/1100
[release] bump to v0.2.1 by @lilei199908 in https://github.com/THUDM/slime/pull/1096