Name | Modified | Size | Downloads / Week |
---|---|---|---|
lmdeploy-0.7.3+cu118-cp38-cp38-manylinux2014_x86_64.whl | 2025-04-14 | 108.8 MB | |
lmdeploy-0.7.3+cu118-cp38-cp38-win_amd64.whl | 2025-04-14 | 47.9 MB | |
lmdeploy-0.7.3+cu118-cp39-cp39-manylinux2014_x86_64.whl | 2025-04-14 | 108.8 MB | |
lmdeploy-0.7.3+cu118-cp39-cp39-win_amd64.whl | 2025-04-14 | 47.9 MB | |
lmdeploy-0.7.3+cu118-cp310-cp310-manylinux2014_x86_64.whl | 2025-04-14 | 108.8 MB | |
lmdeploy-0.7.3+cu118-cp310-cp310-win_amd64.whl | 2025-04-14 | 47.9 MB | |
lmdeploy-0.7.3+cu118-cp311-cp311-manylinux2014_x86_64.whl | 2025-04-14 | 108.8 MB | |
lmdeploy-0.7.3+cu118-cp311-cp311-win_amd64.whl | 2025-04-14 | 47.9 MB | |
lmdeploy-0.7.3+cu118-cp312-cp312-manylinux2014_x86_64.whl | 2025-04-14 | 108.9 MB | |
lmdeploy-0.7.3+cu118-cp312-cp312-win_amd64.whl | 2025-04-14 | 47.9 MB | |
README.md | 2025-04-14 | 3.0 kB | |
v0.7.3 source code.tar.gz | 2025-04-14 | 1.2 MB | |
v0.7.3 source code.zip | 2025-04-14 | 1.9 MB | |
Totals: 13 Items | 787.0 MB | 0 |
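
The wheel names in the table follow the standard `name-version-pythontag-abitag-platformtag.whl` layout. As a rough sketch, the matching file name for the running interpreter could be assembled as below. The helper `wheel_name` is hypothetical (not part of lmdeploy), and it assumes only the CUDA 11.8 builds and the two platforms listed in the table.

```python
import sys


def wheel_name(version="0.7.3", cuda="cu118", py_tag=None, platform_tag=None):
    """Build a wheel file name in the pattern used by the table above.

    Hypothetical helper for illustration; it does not check that the file exists.
    """
    if py_tag is None:
        # e.g. "cp310" for CPython 3.10
        py_tag = f"cp{sys.version_info.major}{sys.version_info.minor}"
    if platform_tag is None:
        # Only the two platforms listed in the table are considered.
        platform_tag = "win_amd64" if sys.platform.startswith("win") else "manylinux2014_x86_64"
    return f"lmdeploy-{version}+{cuda}-{py_tag}-{py_tag}-{platform_tag}.whl"


print(wheel_name())
```

Only the `cp38` to `cp312` tags appear in the table, so a name generated for any other interpreter version would not correspond to a published file.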
What's Changed
🚀 Features
- Add Qwen3 and Qwen3MoE by @lzhangzz in https://github.com/InternLM/lmdeploy/pull/3305
- [Feature] support qwen3 and qwen3-moe for pytorch engine by @CUHKSZzxy in https://github.com/InternLM/lmdeploy/pull/3315
- [ascend]support deepseekv2 by @yao-fengchen in https://github.com/InternLM/lmdeploy/pull/3206
- support ascend w8a8 graph_mode by @yao-fengchen in https://github.com/InternLM/lmdeploy/pull/3267
- support Llama4 by @grimoire in https://github.com/InternLM/lmdeploy/pull/3408
💥 Improvements
- Add spaces_between_special_tokens to /v1/interactive and make compatible with empty text by @AllentDan in https://github.com/InternLM/lmdeploy/pull/3283
- add env var to control timeout by @CUHKSZzxy in https://github.com/InternLM/lmdeploy/pull/3291
- optimize mla, remove load `v` by @grimoire in https://github.com/InternLM/lmdeploy/pull/3334
- refactor dlinfer rope by @yao-fengchen in https://github.com/InternLM/lmdeploy/pull/3326
- enable qwenvl2.5 graph mode on ascend by @jinminxi104 in https://github.com/InternLM/lmdeploy/pull/3367
- Optimize ascend moe by @yao-fengchen in https://github.com/InternLM/lmdeploy/pull/3364
- find port by @grimoire in https://github.com/InternLM/lmdeploy/pull/3429
🐞 Bug fixes
- fix activation grid oversize by @grimoire in https://github.com/InternLM/lmdeploy/pull/3282
- Set ensure_ascii=False for tool calling by @AllentDan in https://github.com/InternLM/lmdeploy/pull/3295
- add `v` check by @grimoire in https://github.com/InternLM/lmdeploy/pull/3307
- Fix Qwen3MoE config parsing by @lzhangzz in https://github.com/InternLM/lmdeploy/pull/3336
- Fix finish reasons by @AllentDan in https://github.com/InternLM/lmdeploy/pull/3338
- remove think_end_token_id in streaming content by @AllentDan in https://github.com/InternLM/lmdeploy/pull/3327
- Fix the finish_reason by @AllentDan in https://github.com/InternLM/lmdeploy/pull/3350
- support List[dict] prompt input without do_preprocess by @irexyc in https://github.com/InternLM/lmdeploy/pull/3385
- fix tensor dispatch in dynamo by @wanfengcxz in https://github.com/InternLM/lmdeploy/pull/3417
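
The `ensure_ascii=False` entry in the list above presumably refers to the `ensure_ascii` parameter of Python's standard `json.dumps`; that is an assumption, since the PR title does not say so. The sketch below shows only the standard-library behaviour, not lmdeploy's own code, and the `arguments` dict is a made-up tool-call payload.

```python
import json

# Hypothetical tool-call arguments containing non-ASCII text.
arguments = {"city": "北京", "unit": "celsius"}

# Default (ensure_ascii=True): non-ASCII characters become \uXXXX escapes.
escaped = json.dumps(arguments)

# ensure_ascii=False: characters are written as-is, which keeps the
# serialized arguments readable.
readable = json.dumps(arguments, ensure_ascii=False)

print(escaped)
print(readable)

# Both strings decode back to the same object.
assert json.loads(escaped) == json.loads(readable) == arguments
```

Both forms are valid JSON and round-trip to the same data; the difference is only in how the serialized string reads.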
📚 Documentation
- update ascend doc by @yao-fengchen in https://github.com/InternLM/lmdeploy/pull/3420
🌐 Other
- bump version to v0.7.2.post1 by @lvhan028 in https://github.com/InternLM/lmdeploy/pull/3298
- Optimize internvit by @caikun-pjlab in https://github.com/InternLM/lmdeploy/pull/3316
- bump version to v0.7.3 by @lvhan028 in https://github.com/InternLM/lmdeploy/pull/3416
New Contributors
- @wanfengcxz made their first contribution in https://github.com/InternLM/lmdeploy/pull/3417
- @caikun-pjlab made their first contribution in https://github.com/InternLM/lmdeploy/pull/3316
Full Changelog: https://github.com/InternLM/lmdeploy/compare/v0.7.2...v0.7.3