# What's new in 2.4.0 (2026-03-29)

These are the changes in Xinference v2.4.0.
## New features
- FEAT: introducing OTEL by @leslie2046 in https://github.com/xorbitsai/inference/pull/4666
- FEAT: [UI] add Xagent link by @yiboyasss in https://github.com/xorbitsai/inference/pull/4693
- FEAT: [UI] remove featured/all toggle and prioritize featured models by @yiboyasss in https://github.com/xorbitsai/inference/pull/4694
- feat(vllm): support v0.18.0 by @llyycchhee in https://github.com/xorbitsai/inference/pull/4718
- FEAT: add gpu load metrics by @leslie2046 in https://github.com/xorbitsai/inference/pull/4712
- feat: Upgrade the base image to version 0.17.1 and add support for aarch64 version images by @zwt-1234 in https://github.com/xorbitsai/inference/pull/4726
- feat(ci): fix aarch64 build by @zwt-1234 in https://github.com/xorbitsai/inference/pull/4735
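For the new OTEL support, OpenTelemetry exporters are conventionally configured through standard environment variables defined by the OTel specification. A minimal config sketch, assuming Xinference picks up these standard variables (how the feature is actually enabled is not documented here, so treat this as illustrative):

```shell
# Standard OTel env vars (spec-defined, not Xinference-specific):
export OTEL_SERVICE_NAME="xinference"
export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4317"

# Start a local Xinference instance as usual.
xinference-local --host 0.0.0.0 --port 9997
```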
## Enhancements
- ENH: update model "qwen3.5" JSON by @qinxuye in https://github.com/xorbitsai/inference/pull/4689
- ENH: update model "qwen3.5" JSON by @llyycchhee in https://github.com/xorbitsai/inference/pull/4707
- ENH: update models JSON [llm] by @XprobeBot in https://github.com/xorbitsai/inference/pull/4710
- ENH: update models JSON [llm] by @XprobeBot in https://github.com/xorbitsai/inference/pull/4713
- enh: adapt normalize param of vllm>0.16.0 for embedding models by @la1ty in https://github.com/xorbitsai/inference/pull/4729
- BLD: Requirements dependency version adjustment by @zwt-1234 in https://github.com/xorbitsai/inference/pull/4736
- bld: Requirements dependency version adjustment by @zwt-1234 in https://github.com/xorbitsai/inference/pull/4737
- bld: Requirements dependency version adjustment by @zwt-1234 in https://github.com/xorbitsai/inference/pull/4738
- REF: parallelize supervisor model registration listing by @leslie2046 in https://github.com/xorbitsai/inference/pull/4690
## Bug fixes
- BUG: Fix async client FormData handling and response lifecycle issues by @qinxuye in https://github.com/xorbitsai/inference/pull/4687
- BUG: MLX backend accumulates intermediate generation steps into final output (tested on 1.17.0, 2.0.0, 2.1.0) [#4615] by @nasircsms in https://github.com/xorbitsai/inference/pull/4617
- fix(worker): inject parent site-packages into child venv via .pth file by @nasircsms in https://github.com/xorbitsai/inference/pull/4692
- BUG: launch multi gpu qwen3.5 error by @llyycchhee in https://github.com/xorbitsai/inference/pull/4700
- fix(tool_call): add qwen3.5 by @llyycchhee in https://github.com/xorbitsai/inference/pull/4703
- fix(qwen3.5): support tool calls by @llyycchhee in https://github.com/xorbitsai/inference/pull/4709
- FIX: qwen3.5 reasoning parse by @llyycchhee in https://github.com/xorbitsai/inference/pull/4719
- fix(qwen3.5): support XML-like tool call format in non-streaming mode by @amumu96 in https://github.com/xorbitsai/inference/pull/4715
- FIX: webui crash when gpu_utilization is none by @leslie2046 in https://github.com/xorbitsai/inference/pull/4728
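The `.pth` worker fix above relies on a standard CPython mechanism: the `site` module reads `.pth` files in a site directory and appends each listed path to `sys.path`, which is how a child venv can see its parent's `site-packages`. A minimal sketch of the mechanism itself (the directories are throwaway stand-ins, not the paths Xinference uses):

```python
import os
import site
import sys
import tempfile

# Two throwaway directories: one plays the child venv's site-packages,
# the other the parent interpreter's site-packages.
child_site = tempfile.mkdtemp()
parent_site = tempfile.mkdtemp()

# A .pth file in the child site dir whose single line is the parent path.
with open(os.path.join(child_site, "_parent.pth"), "w") as f:
    f.write(parent_site + "\n")

# site.addsitedir() adds the directory itself to sys.path and then
# processes every .pth file it contains, adding each listed path.
site.addsitedir(child_site)
print(parent_site in sys.path)  # True
```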
## Documentation
- DOC: add v2.3.0 release notes by @qinxuye in https://github.com/xorbitsai/inference/pull/4688
- DOC: add xagent in readme by @qinxuye in https://github.com/xorbitsai/inference/pull/4699
## New Contributors
- @nasircsms made their first contribution in https://github.com/xorbitsai/inference/pull/4617
- @octo-patch made their first contribution in https://github.com/xorbitsai/inference/pull/4704
- @la1ty made their first contribution in https://github.com/xorbitsai/inference/pull/4729
Full Changelog: https://github.com/xorbitsai/inference/compare/v2.3.0...v2.4.0