Name | Modified | Size | Downloads / Week |
---|---|---|---|
lmdeploy-0.9.2+cu118-cp39-cp39-manylinux2014_x86_64.whl | 2025-07-26 | 66.6 MB | |
lmdeploy-0.9.2+cu118-cp39-cp39-win_amd64.whl | 2025-07-26 | 27.7 MB | |
lmdeploy-0.9.2+cu118-cp310-cp310-manylinux2014_x86_64.whl | 2025-07-26 | 66.6 MB | |
lmdeploy-0.9.2+cu118-cp310-cp310-win_amd64.whl | 2025-07-26 | 27.7 MB | |
lmdeploy-0.9.2+cu118-cp311-cp311-manylinux2014_x86_64.whl | 2025-07-26 | 66.6 MB | |
lmdeploy-0.9.2+cu118-cp311-cp311-win_amd64.whl | 2025-07-26 | 27.7 MB | |
lmdeploy-0.9.2+cu118-cp312-cp312-manylinux2014_x86_64.whl | 2025-07-26 | 66.6 MB | |
lmdeploy-0.9.2+cu118-cp312-cp312-win_amd64.whl | 2025-07-26 | 27.7 MB | |
lmdeploy-0.9.2+cu118-cp313-cp313-manylinux2014_x86_64.whl | 2025-07-26 | 66.6 MB | |
lmdeploy-0.9.2+cu118-cp313-cp313-win_amd64.whl | 2025-07-26 | 27.7 MB | |
README.md | 2025-07-26 | 5.2 kB | |
v0.9.2 source code.tar.gz | 2025-07-26 | 1.3 MB | |
v0.9.2 source code.zip | 2025-07-26 | 2.0 MB | |
Totals: 13 Items | 474.8 MB | 2 |
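The wheel names above follow the standard `name-version-python_tag-abi_tag-platform_tag.whl` layout, so the right file for a given interpreter can be picked mechanically. The following is a minimal sketch under that assumption; `parse_wheel_name` and `pick_wheel` are hypothetical helpers written for illustration and are not part of lmdeploy.

```python
import sys

# A few of the filenames from the listing above.
WHEELS = [
    "lmdeploy-0.9.2+cu118-cp39-cp39-manylinux2014_x86_64.whl",
    "lmdeploy-0.9.2+cu118-cp311-cp311-manylinux2014_x86_64.whl",
    "lmdeploy-0.9.2+cu118-cp311-cp311-win_amd64.whl",
    "lmdeploy-0.9.2+cu118-cp313-cp313-win_amd64.whl",
]


def parse_wheel_name(filename):
    """Split a wheel filename into its five dash-separated fields.

    Assumes there is no optional build tag, which holds for the files listed here.
    """
    stem = filename[: -len(".whl")]
    name, version, py_tag, abi_tag, platform_tag = stem.split("-")
    return {
        "name": name,
        "version": version,  # includes the local label, e.g. "+cu118"
        "python": py_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def pick_wheel(filenames, py_tag, platform_prefix):
    """Return the first wheel matching the interpreter tag and platform prefix."""
    for filename in filenames:
        info = parse_wheel_name(filename)
        if info["python"] == py_tag and info["platform"].startswith(platform_prefix):
            return filename
    return None


# Tag of the running interpreter, e.g. "cp311" for CPython 3.11.
current_tag = f"cp{sys.version_info.major}{sys.version_info.minor}"
print(pick_wheel(WHEELS, current_tag, "manylinux"))
```

In practice `pip` performs this matching itself; the sketch only makes the naming scheme explicit.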
What's Changed
🚀 Features
- [Feature] metrics support by @CUHKSZzxy in https://github.com/InternLM/lmdeploy/pull/3534
- Relax FP8 TP requirement by @lzhangzz in https://github.com/InternLM/lmdeploy/pull/3697
- FA3 by @zhaochaoxing in https://github.com/InternLM/lmdeploy/pull/3623
- support qwen2/2.5-vl in turbomind by @irexyc in https://github.com/InternLM/lmdeploy/pull/3744
- feat: add pytorch_engine_qwen2_5vl_sm120 by @kolmogorov-quyet in https://github.com/InternLM/lmdeploy/pull/3750
- Internvl pt by @RunningLeon in https://github.com/InternLM/lmdeploy/pull/3765
- Improve internvl for turbomind engine by @lvhan028 in https://github.com/InternLM/lmdeploy/pull/3769
💥 Improvements
- Refactor linear by @grimoire in https://github.com/InternLM/lmdeploy/pull/3653
- remove python3.8 support and add python3.13 support by @lvhan028 in https://github.com/InternLM/lmdeploy/pull/3638
- refactor vl inputs split by @grimoire in https://github.com/InternLM/lmdeploy/pull/3699
- [Fix]: Replace mutable default with default_factory for scheduler_stats by @ConvolutedDog in https://github.com/InternLM/lmdeploy/pull/3730
- Fix the logic of calculating max_new_tokens and determining finish_reason by @lvhan028 in https://github.com/InternLM/lmdeploy/pull/3727
- Override HF config.json via CLI by @CUHKSZzxy in https://github.com/InternLM/lmdeploy/pull/3722
- feat(build): Integrate and build turbomind backend directly in setup.py by @windreamer in https://github.com/InternLM/lmdeploy/pull/3726
- Generate the benchmark output filename with given arguments by @lvhan028 in https://github.com/InternLM/lmdeploy/pull/3740
- Make loading llm without vlm as an option by @grimoire in https://github.com/InternLM/lmdeploy/pull/3745
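One item in the list above replaces a mutable default with `default_factory` for `scheduler_stats`. The general Python pattern can be sketched as follows. The class names and fields here are hypothetical stand-ins and not lmdeploy's actual definitions; only the field name `scheduler_stats` comes from the changelog entry.

```python
from dataclasses import dataclass, field


@dataclass
class SchedulerStats:  # hypothetical stand-in for the real stats type
    num_running: int = 0


@dataclass
class EngineOutput:  # hypothetical container class
    # default_factory builds a fresh SchedulerStats for every instance,
    # so no two EngineOutput objects share the same stats object.
    scheduler_stats: SchedulerStats = field(default_factory=SchedulerStats)


a = EngineOutput()
b = EngineOutput()
a.scheduler_stats.num_running += 1
print(a.scheduler_stats.num_running, b.scheduler_stats.num_running)
```

A single shared default object would leak state between instances. In addition, since Python 3.11 `dataclasses` rejects any unhashable value as a field default, which is another reason to prefer `default_factory` for mutable types.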
🐞 Bug fixes
- add ray to ascend requirements by @sigma-plus in https://github.com/InternLM/lmdeploy/pull/3713
- fix accessing undefined attribute `seq_aux` of deepseek-r1-0528 by @lvhan028 in https://github.com/InternLM/lmdeploy/pull/3728
- [Fix]: Avoid quantize qk norm for qwen3 dense models by @taishan1994 in https://github.com/InternLM/lmdeploy/pull/3733
- fix py313 env creation failed when building lmdeploy-builder image by @lvhan028 in https://github.com/InternLM/lmdeploy/pull/3739
- [Fix]: kernel meta retrieval for SM7X does not work by @xiaoajie738 in https://github.com/InternLM/lmdeploy/pull/3746
- limit max_session_len by @grimoire in https://github.com/InternLM/lmdeploy/pull/3751
- fix internvl norm by @grimoire in https://github.com/InternLM/lmdeploy/pull/3756
- support qwen3 moe yarn and vlm hf_overrides by @grimoire in https://github.com/InternLM/lmdeploy/pull/3757
- [PD Disaggregation] fix double unshelf by @JimyMa in https://github.com/InternLM/lmdeploy/pull/3762
- fix(build): fix version parse regex to support post-release versions by @windreamer in https://github.com/InternLM/lmdeploy/pull/3764
- adapt transformers>=v4.52.0 to loading qwen2.5-vl with turbomind by @irexyc in https://github.com/InternLM/lmdeploy/pull/3771
- fix chat template with tool call by @RunningLeon in https://github.com/InternLM/lmdeploy/pull/3773
- fix vl nothink mode by @RunningLeon in https://github.com/InternLM/lmdeploy/pull/3776
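One of the fixes above adjusts the version parse regex to accept post-release versions. The actual pattern used in the lmdeploy build scripts is not shown in these notes, so the following is only an illustrative sketch of the idea. The pattern and the `parse_version` helper are assumptions, with the `.postN` suffix taken from the PEP 440 convention.

```python
import re

# Hypothetical pattern: MAJOR.MINOR.PATCH with an optional ".postN" suffix
# and an optional leading "v" (as in the tag "v0.9.2").
VERSION_RE = re.compile(r"v?(\d+)\.(\d+)\.(\d+)(?:\.post(\d+))?")


def parse_version(text):
    """Return (major, minor, patch, post), where post is None if absent."""
    match = VERSION_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    major, minor, patch, post = match.groups()
    return int(major), int(minor), int(patch), int(post) if post is not None else None


print(parse_version("0.9.2"))
print(parse_version("0.9.2.post1"))
```

A pattern without the optional group would reject a string such as `0.9.2.post1` outright, which is the kind of failure a post-release-aware regex avoids.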
📚 Documentation
- update reward model docs by @CUHKSZzxy in https://github.com/InternLM/lmdeploy/pull/3721
🌐 Other
- update twomicrobatch by @SHshenhao in https://github.com/InternLM/lmdeploy/pull/3651
- [CI]: Upgrade to py310 for ut by @RunningLeon in https://github.com/InternLM/lmdeploy/pull/3718
- [ci] update dailytest environment and scripts by @zhulinJulia24 in https://github.com/InternLM/lmdeploy/pull/3716
- Preliminary Blackwell (sm_120a, RTX 50 series) support by @lzhangzz in https://github.com/InternLM/lmdeploy/pull/3701
- [ci] add fp8 evaluation workflow by @zhulinJulia24 in https://github.com/InternLM/lmdeploy/pull/3729
- Add VRAM bandwidth utilization stat to attention test by @lzhangzz in https://github.com/InternLM/lmdeploy/pull/3731
- doc: fix dead links to MindX DL to recover CI. by @windreamer in https://github.com/InternLM/lmdeploy/pull/3741
- fix free cache in MPEngine branch by @JimyMa in https://github.com/InternLM/lmdeploy/pull/3670
- fix: make RelWithDebInfo default cmake build type by @windreamer in https://github.com/InternLM/lmdeploy/pull/3774
- bump version to v0.9.2 by @lvhan028 in https://github.com/InternLM/lmdeploy/pull/3770
New Contributors
- @sigma-plus made their first contribution in https://github.com/InternLM/lmdeploy/pull/3713
- @ConvolutedDog made their first contribution in https://github.com/InternLM/lmdeploy/pull/3730
- @windreamer made their first contribution in https://github.com/InternLM/lmdeploy/pull/3726
- @taishan1994 made their first contribution in https://github.com/InternLM/lmdeploy/pull/3733
- @xiaoajie738 made their first contribution in https://github.com/InternLM/lmdeploy/pull/3746
- @kolmogorov-quyet made their first contribution in https://github.com/InternLM/lmdeploy/pull/3750
Full Changelog: https://github.com/InternLM/lmdeploy/compare/v0.9.1...v0.9.2