Name | Modified | Size | Downloads / Week |
---|---|---|---|
README.md | 2025-06-27 | 4.1 kB | |
v1.7.1 source code.tar.gz | 2025-06-27 | 32.9 MB | |
v1.7.1 source code.zip | 2025-06-27 | 33.7 MB | |
Totals: 3 Items | | 66.6 MB | 0 |
What's new in 1.7.1 (2025-06-27)
These are the changes in inference v1.7.1.
New features
- FEAT: [UI] enhance audio & rerank model registration params. by @yiboyasss in https://github.com/xorbitsai/inference/pull/3656
- FEAT: support async client by @zhcn000000 in https://github.com/xorbitsai/inference/pull/3645
- FEAT: [UI] add max_tokens display in rerank model. by @yiboyasss in https://github.com/xorbitsai/inference/pull/3671
- FEAT: [UI] add model_ability options for LLM registration. by @yiboyasss in https://github.com/xorbitsai/inference/pull/3663
- FEAT: support qwenLong-l1 by @Jun-Howie in https://github.com/xorbitsai/inference/pull/3691
- FEAT: [UI] model registration supports packages. by @yiboyasss in https://github.com/xorbitsai/inference/pull/3702
- FEAT: support MLU device by @nan9126 in https://github.com/xorbitsai/inference/pull/3693
- FEAT: vllm v1 auto enabling by @qinxuye in https://github.com/xorbitsai/inference/pull/3637
- FEAT: distributed inference for MLX by @qinxuye in https://github.com/xorbitsai/inference/pull/3700
Enhancements
- ENH: add `enable_flash_attn` param for loading qwen3 embedding & rerank by @qinxuye in https://github.com/xorbitsai/inference/pull/3640
- ENH: add more abilities for builtin model families API by @qinxuye in https://github.com/xorbitsai/inference/pull/3658
- ENH: improve local cluster startup reliability via child-process readiness signaling by @Checkmate544 in https://github.com/xorbitsai/inference/pull/3642
- ENH: FishSpeech support pcm by @codingl2k1 in https://github.com/xorbitsai/inference/pull/3680
- ENH: Add 4-sample micro-batching to Qwen-3 reranker to reduce GPU memory by @yasu-oh in https://github.com/xorbitsai/inference/pull/3666
- ENH: Limit default n_parallel for llama.cpp backend by @codingl2k1 in https://github.com/xorbitsai/inference/pull/3712
- BLD: pin flash-attn & flashinfer-python version and limit sgl-kernel version by @amumu96 in https://github.com/xorbitsai/inference/pull/3669
- BLD: Update Dockerfile by @XiaoXiaoJiangYun in https://github.com/xorbitsai/inference/pull/3695
- REF: remove unused code by @qinxuye in https://github.com/xorbitsai/inference/pull/3664
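For readers unfamiliar with the micro-batching idea mentioned for the Qwen-3 reranker, here is a minimal, library-free sketch. It is not the code from the pull request: `iter_micro_batches`, `rerank_in_micro_batches`, and `toy_score` are hypothetical names, and the toy scorer only stands in for a real model forward pass. The point it illustrates is that scoring a few (query, document) pairs at a time bounds peak memory by the chunk size instead of the full candidate list.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

Pair = Tuple[str, str]


def iter_micro_batches(items: Sequence, batch_size: int = 4) -> Iterator[Sequence]:
    """Yield consecutive slices of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def rerank_in_micro_batches(
    query: str,
    documents: Sequence[str],
    score_fn: Callable[[List[Pair]], List[float]],
    batch_size: int = 4,
) -> List[float]:
    """Score every (query, document) pair, one small chunk at a time.

    `score_fn` is a placeholder for a model call; only `batch_size`
    pairs are handed to it per invocation.
    """
    scores: List[float] = []
    for chunk in iter_micro_batches(documents, batch_size):
        scores.extend(score_fn([(query, doc) for doc in chunk]))
    return scores


def toy_score(pairs: List[Pair]) -> List[float]:
    """Hypothetical scorer: number of words shared by query and document."""
    return [float(len(set(q.split()) & set(d.split()))) for q, d in pairs]


if __name__ == "__main__":
    docs = ["gpu memory use", "cpu only", "memory", "gpu", "gpu memory"]
    print(rerank_in_micro_batches("gpu memory", docs, toy_score, batch_size=4))
```

A smaller `batch_size` trades throughput for lower peak memory; the value 4 here simply mirrors the figure named in the entry above.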
Bug fixes
- BUG: fix TTS error: No such file or directory by @robin12jbj in https://github.com/xorbitsai/inference/pull/3625
- BUG: Fix max_tokens value in Qwen3 Reranker by @yasu-oh in https://github.com/xorbitsai/inference/pull/3665
- BUG: fix custom embedding by @qinxuye in https://github.com/xorbitsai/inference/pull/3677
- BUG: [UI] rename the command-line argument from download-hub to download_hub. by @yiboyasss in https://github.com/xorbitsai/inference/pull/3685
- BUG: fix jina-clip-v2 for text only or image only by @qinxuye in https://github.com/xorbitsai/inference/pull/3690
- BUG: internvl chat error using vllm engine by @amumu96 in https://github.com/xorbitsai/inference/pull/3722
- BUG: fix the parsing logic of streaming tool calls by @amumu96 in https://github.com/xorbitsai/inference/pull/3721
- BUG: fix `<think>` wrongly added when set `chat_template_kwargs {"enable_thinking": False}` by @qinxuye in https://github.com/xorbitsai/inference/pull/3718
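As background for the streaming tool-call fix, the sketch below shows one common way a client reassembles tool calls from streamed fragments. It assumes OpenAI-style deltas (an `index`, an optional `id`, and a `function` object whose `arguments` string arrives in pieces); that shape and the helper name `merge_tool_call_deltas` are assumptions for illustration, not the parsing logic changed in the pull request.

```python
import json
from typing import Any, Dict, Iterable, List


def merge_tool_call_deltas(deltas: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Merge streamed tool-call fragments into complete calls.

    Fragments with the same `index` belong to the same call; the
    JSON `arguments` string is concatenated before it is decoded.
    """
    calls: Dict[int, Dict[str, str]] = {}
    for delta in deltas:
        slot = calls.setdefault(
            delta["index"], {"id": "", "name": "", "arguments": ""}
        )
        if delta.get("id"):
            slot["id"] = delta["id"]
        function = delta.get("function") or {}
        if function.get("name"):
            slot["name"] = function["name"]
        slot["arguments"] += function.get("arguments") or ""

    merged = []
    for index in sorted(calls):
        slot = calls[index]
        merged.append(
            {
                "id": slot["id"],
                "name": slot["name"],
                # Decode only after the stream ends; partial JSON is not valid.
                "arguments": json.loads(slot["arguments"] or "{}"),
            }
        )
    return merged


if __name__ == "__main__":
    stream = [
        {"index": 0, "id": "call_1", "function": {"name": "get_weather", "arguments": ""}},
        {"index": 0, "function": {"arguments": '{"city": '}},
        {"index": 0, "function": {"arguments": '"Paris"}'}},
    ]
    print(merge_tool_call_deltas(stream))
```

Decoding the arguments only once the stream has finished avoids errors from parsing a half-received JSON string.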
Documentation
- DOC: add doc for paraformer by @leslie2046 in https://github.com/xorbitsai/inference/pull/3631
- DOC: Flexible model (traditional ML models) by @qinxuye in https://github.com/xorbitsai/inference/pull/3714
New Contributors
- @robin12jbj made their first contribution in https://github.com/xorbitsai/inference/pull/3625
- @zhcn000000 made their first contribution in https://github.com/xorbitsai/inference/pull/3645
- @yasu-oh made their first contribution in https://github.com/xorbitsai/inference/pull/3665
- @Checkmate544 made their first contribution in https://github.com/xorbitsai/inference/pull/3642
- @nan9126 made their first contribution in https://github.com/xorbitsai/inference/pull/3693
- @XiaoXiaoJiangYun made their first contribution in https://github.com/xorbitsai/inference/pull/3695
Full Changelog: https://github.com/xorbitsai/inference/compare/v1.7.0...v1.7.1