LMDeploy - Browse /v0.9.1 at SourceForge.net

Name	Modified	Size	Downloads / Week
lmdeploy-0.9.1+cu118-cp38-cp38-manylinux2014_x86_64.whl	2025-07-10	90.4 MB	0
lmdeploy-0.9.1+cu118-cp38-cp38-win_amd64.whl	2025-07-10	26.4 MB	0
lmdeploy-0.9.1+cu118-cp39-cp39-manylinux2014_x86_64.whl	2025-07-10	90.4 MB	0
lmdeploy-0.9.1+cu118-cp39-cp39-win_amd64.whl	2025-07-10	26.4 MB	0
lmdeploy-0.9.1+cu118-cp310-cp310-manylinux2014_x86_64.whl	2025-07-10	90.4 MB	0
lmdeploy-0.9.1+cu118-cp310-cp310-win_amd64.whl	2025-07-10	26.4 MB	1
lmdeploy-0.9.1+cu118-cp311-cp311-win_amd64.whl	2025-07-10	26.4 MB	0
lmdeploy-0.9.1+cu118-cp312-cp312-manylinux2014_x86_64.whl	2025-07-10	90.5 MB	0
lmdeploy-0.9.1+cu118-cp312-cp312-win_amd64.whl	2025-07-10	26.4 MB	0
lmdeploy-0.9.1+cu118-cp311-cp311-manylinux2014_x86_64.whl	2025-07-10	90.5 MB	0
README.md	2025-07-04	3.1 kB	0
v0.9.1 source code.tar.gz	2025-07-04	1.3 MB	0
v0.9.1 source code.zip	2025-07-04	1.9 MB	0
Totals: 13 Items		587.4 MB	1
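
The wheel names above follow the standard wheel filename layout (`name-version-python tag-ABI tag-platform tag.whl`), so the matching file for a given interpreter and OS can be picked mechanically. The sketch below is only an illustration: the helper names `parse_wheel`, `pick_wheel`, and `current_py_tag` are hypothetical and are not part of LMDeploy, and it assumes none of the names carry an optional build tag (true for this listing). The `+cu118` suffix is a local version label; it presumably marks a CUDA 11.8 build, but the listing itself does not say so.

```python
import sys

# A few of the wheel names from the listing above.
WHEELS = [
    "lmdeploy-0.9.1+cu118-cp38-cp38-manylinux2014_x86_64.whl",
    "lmdeploy-0.9.1+cu118-cp310-cp310-manylinux2014_x86_64.whl",
    "lmdeploy-0.9.1+cu118-cp310-cp310-win_amd64.whl",
    "lmdeploy-0.9.1+cu118-cp312-cp312-win_amd64.whl",
]


def parse_wheel(filename):
    """Split a wheel filename into its tags (assumes no build tag)."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) != 5:
        raise ValueError(f"unexpected wheel name layout: {filename}")
    name, version, py_tag, abi_tag, plat_tag = parts
    return {
        "name": name,
        "version": version,
        "python": py_tag,
        "abi": abi_tag,
        "platform": plat_tag,
    }


def pick_wheel(filenames, py_tag, plat_tag):
    """Return the first filename matching both tags, or None."""
    for filename in filenames:
        info = parse_wheel(filename)
        if info["python"] == py_tag and info["platform"] == plat_tag:
            return filename
    return None


def current_py_tag():
    """CPython tag for the running interpreter, e.g. 'cp310'."""
    return f"cp{sys.version_info.major}{sys.version_info.minor}"


if __name__ == "__main__":
    print(pick_wheel(WHEELS, current_py_tag(), "manylinux2014_x86_64"))
```

In practice `pip` performs this tag matching itself when pointed at a directory or index of wheels; the sketch only makes the naming convention explicit.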

What's Changed

feature: enable tool_call and reasoning_content parsing for qwen3 by @ywx217 in https://github.com/InternLM/lmdeploy/pull/3615
Support Mooncake migration backend for PD disaggregation by @Risc-lt in https://github.com/InternLM/lmdeploy/pull/3620
Support load fused moe weights by @RunningLeon in https://github.com/InternLM/lmdeploy/pull/3672
Separate api_server and pytorch engine into different processes by @grimoire in https://github.com/InternLM/lmdeploy/pull/3627
add reward model api by @CUHKSZzxy in https://github.com/InternLM/lmdeploy/pull/3665

[ascend] import patch at initializing time by @JackWeiw in https://github.com/InternLM/lmdeploy/pull/3662
[ascend] use custom transdata in python kernel by @JackWeiw in https://github.com/InternLM/lmdeploy/pull/3671
move import transformers in patch by @grimoire in https://github.com/InternLM/lmdeploy/pull/3660
set ray envs by @grimoire in https://github.com/InternLM/lmdeploy/pull/3643
raise ImportError when ep is enabled and dlblas is not installed by @zhaochaoxing in https://github.com/InternLM/lmdeploy/pull/3636
Reduce sampling memory usage by @lzhangzz in https://github.com/InternLM/lmdeploy/pull/3666

fix dockerfile by @lvhan028 in https://github.com/InternLM/lmdeploy/pull/3657
Fix top-p only sampling with padded vocab size by @lzhangzz in https://github.com/InternLM/lmdeploy/pull/3661
fix pt engine stop & cancel by @irexyc in https://github.com/InternLM/lmdeploy/pull/3681
Fix convert bf16 to numpy by @RunningLeon in https://github.com/InternLM/lmdeploy/pull/3686
disable torch.compile in cuda graph runner by @grimoire in https://github.com/InternLM/lmdeploy/pull/3691
fix reward model api by @CUHKSZzxy in https://github.com/InternLM/lmdeploy/pull/3703

Full Changelog: https://github.com/InternLM/lmdeploy/compare/v0.9.0...v0.9.1

Source: README.md, updated 2025-07-04