Changes
- Gemma 4 support with full tool-calling in the API and UI. 🆕
- ik_llama.cpp support: Add ik_llama.cpp as a new backend through new `textgen-portable-ik` portable builds and a new `--ik` flag for full installs. ik_llama.cpp is a fork by the author of the imatrix quants, including support for new quant types, significantly more accurate KV cache quantization (via Hadamard KV cache rotation, enabled by default), and optimizations for MoE models and CPU inference.
- API: Add `echo` + `logprobs` for `/v1/completions`. The completions endpoint now supports the `echo` and `logprobs` parameters, returning token-level log probabilities for both prompt and generated tokens. Token IDs are also included in the output via a new `top_logprobs_ids` field.
- Further optimize my custom Gradio fork, saving up to 50 ms per UI event (button click, etc.).
- Transformers: Autodetect `torch_dtype` from the model config instead of always forcing bfloat16/float16. The `--bf16` flag still works as an override.
- Remove the obsolete `models/config.yaml` file. Instruction templates are now detected from model metadata instead of filename patterns.
- Rename "truncation length" to "context length" in the terminal log message.
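To illustrate the new completions parameters, here is a minimal sketch of building a `/v1/completions` request with `echo` and `logprobs` and reading the per-token data back out. The host/port are assumptions (point it at your own API server), and the response values below are illustrative, not real model output:

```python
import json
import urllib.request

# Assumed local API address; adjust to your own server settings.
API_URL = "http://127.0.0.1:5000/v1/completions"

payload = {
    "prompt": "The quick brown",
    "max_tokens": 8,
    "echo": True,   # include prompt tokens (and their logprobs) in the output
    "logprobs": 5,  # request top-5 log probabilities per token
}

def extract_logprobs(response: dict) -> dict:
    """Collect per-token logprobs, plus the new top_logprobs_ids field."""
    lp = response["choices"][0].get("logprobs") or {}
    return {
        "tokens": lp.get("tokens", []),
        "token_logprobs": lp.get("token_logprobs", []),
        "top_logprobs_ids": lp.get("top_logprobs_ids", []),
    }

# To send the request for real (requires a running server):
#   req = urllib.request.Request(API_URL, json.dumps(payload).encode(),
#                                {"Content-Type": "application/json"})
#   response = json.load(urllib.request.urlopen(req))

# Illustrative response shape (invented values):
response = {
    "choices": [{
        "text": "The quick brown fox",
        "logprobs": {
            "tokens": ["The", " quick", " brown", " fox"],
            "token_logprobs": [None, -0.71, -0.18, -1.32],
            "top_logprobs_ids": [[464], [2068], [7586], [21831]],
        },
    }]
}
print(extract_logprobs(response)["tokens"])
```

With `echo` enabled, the prompt tokens appear first in the lists; their first logprob is `None` since there is no preceding context to condition on.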
Security
- Gradio fork: Fix ACL bypass via case-insensitive path matching on Windows/macOS.
- Gradio fork: Add server-side validation for Dropdown, Radio, and CheckboxGroup.
- Fix SSRF in superbooga extensions: URLs fetched by superbooga/superboogav2 are now validated to block requests to private/internal networks.
Bug fixes
- Fix `--idle-timeout` failing on encode/decode requests and not tracking parallel generation properly.
- Fix stopping string detection for chromadb/context-1 (`<|return|>` vs `<|result|>`).
- Fix Qwen3.5 MoE failing to load via ExLlamav3_HF.
- Fix `ban_eos_token` not working for ExLlamav3. EOS is now suppressed at the logit level.
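Logit-level suppression, as a general technique, can be sketched as follows (this is a generic toy example, not the ExLlamav3 implementation): banned token logits are set to negative infinity before softmax, so they receive zero probability regardless of sampler settings.

```python
import math

def suppress_tokens(logits, banned_ids):
    """Set banned token logits to -inf so softmax assigns them probability 0."""
    out = list(logits)
    for tid in banned_ids:
        out[tid] = -math.inf
    return out

def softmax(logits):
    m = max(x for x in logits if x != -math.inf)
    exps = [math.exp(x - m) if x != -math.inf else 0.0 for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy vocabulary of 4 tokens, where id 3 plays the role of EOS:
logits = [1.0, 0.5, 0.2, 3.0]
probs = softmax(suppress_tokens(logits, banned_ids=[3]))
```

Because the ban happens before any sampling step, no temperature, top-p, or other sampler setting can resurrect the banned token.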
Dependency updates
- Update llama.cpp to https://github.com/ggml-org/llama.cpp/tree/a1cfb645307edc61a89e41557f290f441043d3c2
  - Adds Gemma-4 support
  - Adds improved KV cache quantization via activation rotation, based on TurboQuant (https://github.com/ggml-org/llama.cpp/pull/21038)
- Update ExLlamaV3 to 0.0.28
- Update transformers to 5.5
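The intuition behind rotation-based cache quantization can be shown with a generic toy sketch (this is not the llama.cpp implementation): an orthonormal Hadamard rotation spreads outlier values across all dimensions, so an absmax quantizer needs a smaller scale and loses less precision on the remaining values.

```python
import math
import random

def hadamard_rotate(x):
    """Orthonormal fast Walsh-Hadamard transform; len(x) must be a power of 2.
    H/sqrt(n) is its own inverse, so applying it twice recovers the input."""
    x = list(x)
    n = len(x)
    assert n & (n - 1) == 0, "length must be a power of two"
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a, b = x[j], x[j + h]
                x[j] = a + b
                x[j + h] = a - b
        h *= 2
    s = math.sqrt(n)
    return [v / s for v in x]

def q8_roundtrip(v):
    """Symmetric absmax int8 quantize + dequantize."""
    scale = max(abs(t) for t in v) / 127.0
    return [round(t / scale) * scale for t in v]

def mse(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)

random.seed(0)
v = [random.gauss(0, 0.1) for _ in range(64)]
v[0] = 10.0  # a single outlier dominates the absmax scale

err_direct = mse(q8_roundtrip(v), v)
# Rotate, quantize, rotate back (the normalized transform is its own inverse):
err_rotated = mse(hadamard_rotate(q8_roundtrip(hadamard_rotate(v))), v)
```

Because the rotation is orthonormal, quantization error in the rotated domain maps back to the same error in the original domain, so the smaller post-rotation scale translates directly into a lower reconstruction error.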
Portable builds
Below you can find self-contained packages that work with GGUF models (llama.cpp) and require no installation! Just download the right version for your system, unzip/extract, and run.
Which version to download:
- Windows/Linux:
  - NVIDIA GPU:
    - Older driver: Use `cuda12.4`.
    - Newer driver (nvidia-smi reports CUDA Version >= 13.1): Use `cuda13.1`.
  - AMD/Intel GPU: Use `vulkan`.
  - AMD GPU (ROCm): Use `rocm`.
  - CPU only: Use `cpu`.
- Mac:
  - Apple Silicon: Use `macos-arm64`.
  - Intel: Use `macos-x86_64`.
The `textgen-portable-ik` builds are for ik_llama.cpp.
Updating a portable install:
- Download and extract the latest version.
- Replace the `user_data` folder with the one from your existing install. All your settings and models will be moved.
Starting with 4.0, you can also move `user_data` one folder up, next to the install folder. It will be detected automatically, making updates easier:
```txt
text-generation-webui-4.0/
text-generation-webui-4.1/
user_data/   <-- shared by both installs
```
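The one-time move amounts to the following (folder names are illustrative; substitute your actual install directory, and the `mkdir` only simulates an existing install for demonstration):

```shell
# Simulate an existing 4.0 portable install (illustrative only):
mkdir -p text-generation-webui-4.0/user_data
# Move user_data up one level so current and future installs share it:
mv text-generation-webui-4.0/user_data ./user_data
ls -d user_data
```

After this, new versions extracted next to `user_data/` pick it up automatically, with no per-update copying.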