| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| README.md | 2025-12-26 | 4.3 kB | |
| v1.16.0 source code.tar.gz | 2025-12-26 | 54.9 MB | |
| v1.16.0 source code.zip | 2025-12-26 | 55.9 MB | |
| Totals: 3 Items | | 110.8 MB | 0 |
What's new in 1.16.0 (2025-12-27)
These are the changes in inference v1.16.0.
New features
- FEAT: [model] DeepSeek-V3.2-Exp support by @Jun-Howie in https://github.com/xorbitsai/inference/pull/4374
- FEAT: Add vLLM backend support for DeepSeek-V3.2 by @Jun-Howie in https://github.com/xorbitsai/inference/pull/4377
- FEAT: Add vLLM backend support for DeepSeek-V3.2-Exp by @Jun-Howie in https://github.com/xorbitsai/inference/pull/4375 (see the usage sketch after this list)
- FEAT: vacc support by @ZhikaiGuo960110 in https://github.com/xorbitsai/inference/pull/4382
- FEAT: support vlm for vacc by @ZhikaiGuo960110 in https://github.com/xorbitsai/inference/pull/4385
- FEAT: [model] Fun-ASR-Nano-2512 support by @leslie2046 in https://github.com/xorbitsai/inference/pull/4397
- FEAT: [model] Qwen-Image-Layered support by @OliverBryant in https://github.com/xorbitsai/inference/pull/4395
- FEAT: [model] Fun-ASR-MLT-Nano-2512 support by @leslie2046 in https://github.com/xorbitsai/inference/pull/4398
- FEAT: continuous batching support for MLX chat models by @qinxuye in https://github.com/xorbitsai/inference/pull/4403
- FEAT: Add the architectures field for llm model launch by @OliverBryant in https://github.com/xorbitsai/inference/pull/4405
- FEAT: [UI] image models support configuration via environment variables and custom parameters. by @yiboyasss in https://github.com/xorbitsai/inference/pull/4413
- FEAT: support rerank async batch by @llyycchhee in https://github.com/xorbitsai/inference/pull/4414
- FEAT: Support vLLM backend for MiniMaxM2ForCausalLM by @Jun-Howie in https://github.com/xorbitsai/inference/pull/4412
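
The new vLLM backends above are used through the regular Xinference client API rather than any new interface. The sketch below shows one way to launch a DeepSeek model with the vLLM engine and send a chat request; it is an illustrative example only, and the endpoint URL, registered model name (`deepseek-v3.2` here), and the generate options are assumptions that should be checked against the model list of your own deployment.

```python
# Minimal sketch: launching a DeepSeek model on the vLLM engine via the
# Xinference Python client. Endpoint, model name, and options are assumptions.
from xinference.client import Client

# Connect to a running Xinference endpoint.
client = Client("http://127.0.0.1:9997")

# Launch with vLLM as the inference engine; depending on the model,
# extra arguments such as model_size_in_billions, model_format, or
# quantization may also be required.
model_uid = client.launch_model(
    model_name="deepseek-v3.2",  # assumed registered name; verify in your deployment
    model_engine="vllm",
)

# Send a simple chat request through the launched model.
model = client.get_model(model_uid)
response = model.chat(
    messages=[{"role": "user", "content": "Summarize what DeepSeek-V3.2 is."}],
    generate_config={"max_tokens": 128},
)
print(response["choices"][0]["message"]["content"])
```

The same `model_engine="vllm"` launch path applies to the MiniMaxM2ForCausalLM support added in this release.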
Enhancements
- ENH: fix replica assignment so that GPU indexes are assigned contiguously by @ZhikaiGuo960110 in https://github.com/xorbitsai/inference/pull/4370
- ENH: update model "DeepSeek-V3.2" JSON by @Jun-Howie in https://github.com/xorbitsai/inference/pull/4381
- ENH: update model "glm-4.5" JSON by @OliverBryant in https://github.com/xorbitsai/inference/pull/4383
- ENH: update 2 models JSON ("glm-4.1v-thinking", "glm-4.5v") by @OliverBryant in https://github.com/xorbitsai/inference/pull/4384
- ENH: support torchaudio 2.9.0 by @llyycchhee in https://github.com/xorbitsai/inference/pull/4390
- ENH: update 3 models JSON ("llama-2-chat", "llama-3", "llama-3-instruct") by @OliverBryant in https://github.com/xorbitsai/inference/pull/4400
- ENH: update 4 models JSON ("llama-3.1", "llama-3.1-instruct", "llama-3.2-vision-instruct", ... +1 more) by @OliverBryant in https://github.com/xorbitsai/inference/pull/4401
- ENH: update model "jina-embeddings-v3" JSON by @XprobeBot in https://github.com/xorbitsai/inference/pull/4404
- ENH: update models JSON [audio, embedding, image, llm, video] by @XprobeBot in https://github.com/xorbitsai/inference/pull/4407
- ENH: update models JSON [audio, image] by @XprobeBot in https://github.com/xorbitsai/inference/pull/4408
- ENH: update model "Z-Image-Turbo" JSON by @OliverBryant in https://github.com/xorbitsai/inference/pull/4409
- ENH: update 2 models JSON ("DeepSeek-V3.2", "DeepSeek-V3.2-Exp") by @Jun-Howie in https://github.com/xorbitsai/inference/pull/4392
- ENH: update models JSON [llm] by @XprobeBot in https://github.com/xorbitsai/inference/pull/4415
- BLD: remove python 3.9 support by @OliverBryant in https://github.com/xorbitsai/inference/pull/4387
- BLD: update Dockerfile to CUDA 12.9 to use vLLM v0.11.2 by @zwt-1234 in https://github.com/xorbitsai/inference/pull/4393
Bug fixes
- BUG: fix PaddleOCR-VL output by @leslie2046 in https://github.com/xorbitsai/inference/pull/4368
- BUG: fix analysis errors for custom embedding and rerank models by @OliverBryant in https://github.com/xorbitsai/inference/pull/4367
- BUG: fix failure to launch models on CPU and multi-worker launch errors by @OliverBryant in https://github.com/xorbitsai/inference/pull/4361
- BUG: fix OCR API returning null and add docs on how to modify model_size by @OliverBryant in https://github.com/xorbitsai/inference/pull/4331
- BUG: fix n_gpu parameter by @OliverBryant in https://github.com/xorbitsai/inference/pull/4411 (see the sketch after this list)
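
Two of the fixes above touch how devices are assigned at launch time (CPU-only launch and the n_gpu parameter). The snippet below is a minimal sketch of those code paths using the Python client; the endpoint and the embedding model name are placeholders, not values taken from this release.

```python
# Minimal sketch of the n_gpu / CPU-launch paths exercised by the fixes above.
# Endpoint and model name are placeholders.
from xinference.client import Client

client = Client("http://127.0.0.1:9997")

# Launch an embedding model on CPU only by passing n_gpu=None
# (the CPU launch path addressed in #4361).
cpu_uid = client.launch_model(
    model_name="bge-m3",      # assumed built-in embedding model name
    model_type="embedding",
    n_gpu=None,
)

# Or let Xinference pick GPUs automatically / pin an explicit GPU count
# (the n_gpu handling addressed in #4411).
gpu_uid = client.launch_model(
    model_name="bge-m3",
    model_type="embedding",
    n_gpu="auto",             # "auto" or an integer number of GPUs
)
```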
Documentation
- DOC: update new models and release notes for v1.15.0 by @qinxuye in https://github.com/xorbitsai/inference/pull/4359
Full Changelog: https://github.com/xorbitsai/inference/compare/v1.15.0...v1.16.0