Text Generation Web UI - Browse /v4.9 at SourceForge.net

Name	Modified	Size	Downloads / Week
textgen-portable-ik-4.9-windows-cuda12.4.zip	2026-05-20	1.3 GB	1
textgen-portable-ik-4.9-windows-cuda13.1.zip	2026-05-20	1.4 GB	8
textgen-portable-4.9-windows-cuda12.4.zip	2026-05-20	981.3 MB	0
textgen-portable-4.9-windows-rocm7.2.zip	2026-05-20	646.9 MB	1
textgen-portable-4.9-windows-cuda13.1.zip	2026-05-20	880.6 MB	9
textgen-portable-4.9-windows-cpu.zip	2026-05-20	334.9 MB	1
textgen-portable-ik-4.9-windows-cpu.zip	2026-05-20	351.7 MB	1
textgen-portable-4.9-windows-vulkan.zip	2026-05-20	352.2 MB	1
textgen-portable-ik-4.9-linux-cuda12.4.tar.gz	2026-05-20	1.3 GB	0
textgen-portable-4.9-macos-x86_64.tar.gz	2026-05-20	297.4 MB	3
textgen-portable-ik-4.9-linux-cuda13.1.tar.gz	2026-05-20	1.4 GB	1
textgen-portable-4.9-linux-cuda12.4.tar.gz	2026-05-20	936.7 MB	1
textgen-portable-4.9-linux-arm64-cuda13.1.tar.gz	2026-05-20	953.9 MB	1
textgen-portable-4.9-linux-cuda13.1.tar.gz	2026-05-20	866.4 MB	1
textgen-portable-4.9-linux-cpu.tar.gz	2026-05-20	322.4 MB	1
textgen-portable-4.9-linux-rocm7.2.tar.gz	2026-05-20	429.0 MB	1
textgen-portable-4.9-linux-vulkan.tar.gz	2026-05-20	339.8 MB	1
textgen-portable-ik-4.9-linux-cpu.tar.gz	2026-05-20	354.5 MB	0
textgen-portable-4.9-macos-arm64.tar.gz	2026-05-20	285.0 MB	1
README.md	2026-05-20	7.7 kB	0
v4.9 source code.tar.gz	2026-05-20	24.9 MB	1
v4.9 source code.zip	2026-05-20	25.1 MB	1
Totals: 22 Items		13.9 GB	35

Changes

MTP speculative decoding support: Add draft-mtp as a new --spec-type option. Auto-enabled when loading MTP GGUFs (e.g. Qwen 3.6 MoE MTP builds).
Web search improvements:
Add snippet support to the web_search tool: results now include a short text excerpt that often answers the query directly, eliminating the need for a follow-up fetch_webpage call (#7548).
Drop link URLs from fetch_webpage output (links now appear as plain text instead of [text](url) markdown), significantly reducing tokens used per page.
Prettier rendering of web_search results in the chat, with a spinner during the call.
Add an info message to the "Activate web search" checkbox.
Show live generation speed (tokens/s) and context size while generating (#7563).
DGX Spark support: Add Linux aarch64 portable builds.
Electron
Add "Check for updates" button in the Session tab.
Add a folder picker for the models directory.
Add right-click context menu for copying text.
Add a spellcheck toggle in the Session tab (#7550).
Store app data in user_data/cache/electron instead of the OS default location.
Disable DNS-over-HTTPS probes.
One-click installer: Track the latest release tag instead of bleeding-edge main.
Auto-detect and auto-select sibling mmproj files when loading a model (#7564).
Detect mmproj-*.gguf files in the main models folder: They appear in the mmproj dropdown and are hidden from the regular model dropdown.
Project icon: Add an icon, courtesy of LMLocalizer on Reddit.
Treat negative --ctx-size values as auto (0).
UI
Add drag-and-drop file upload support to the chat input (Gradio fork).
Reorganize the right sidebar with Mode/Character/Chat style on top.
Hide reasoning and tools controls in chat mode (only shown in instruct / chat-instruct).
Fade in new messages, fix scroll-up jump on send.
Rename "Send dummy message/reply" to "Insert user/assistant message".
Polish character dropdown in chat tab.
Tighten spacing between dropdowns and refresh buttons.
Improve the looks of the Session tab.
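
As an illustration of two items in the list above (mmproj detection in the main models folder, and negative --ctx-size values treated as auto), here is a minimal Python sketch. The function names `split_model_files` and `normalize_ctx_size` are hypothetical, and the case-sensitive match is an assumption; this is not the project's actual code.

```python
from fnmatch import fnmatchcase


def split_model_files(filenames):
    """Split a folder listing into (models, mmproj) lists.

    Files matching mmproj-*.gguf go to the mmproj list and are
    left out of the regular model list.
    """
    models, mmproj = [], []
    for name in filenames:
        # Case-sensitive match; whether the real check ignores case is unknown.
        if fnmatchcase(name, "mmproj-*.gguf"):
            mmproj.append(name)
        else:
            models.append(name)
    return models, mmproj


def normalize_ctx_size(value: int) -> int:
    """Map any negative context size to 0, which means 'auto'."""
    return 0 if value < 0 else value
```

For example, a folder holding `model-Q4_K_M.gguf` and `mmproj-model-f16.gguf` would list only the first file in the model dropdown and only the second in the mmproj dropdown.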

Security

Restrict CORS to localhost by default to prevent drive-by API access. --listen and --public-api opt into network exposure.
Sanitize character name in load_character to prevent path traversal.
fix: prevent path traversal in load_template_by_name (#7562). Thanks, @Allen930311.
UI: Improve web search security by rejecting non-HTTP links.
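
The path traversal and non-HTTP link fixes above follow a common pattern, sketched below in Python. The helpers `safe_child_path` and `is_http_url`, the `.yaml` suffix, and the example directory are assumptions for illustration; the actual checks in load_character, load_template_by_name, and the web search code may differ.

```python
from pathlib import Path
from urllib.parse import urlparse


def safe_child_path(base_dir: str, name: str, suffix: str = ".yaml") -> Path:
    """Resolve name inside base_dir, refusing anything that escapes it."""
    base = Path(base_dir).resolve()
    candidate = (base / f"{name}{suffix}").resolve()
    # After resolving ".." segments, the result must still live under base.
    if not candidate.is_relative_to(base):  # requires Python 3.9+
        raise ValueError(f"unsafe name: {name!r}")
    return candidate


def is_http_url(url: str) -> bool:
    """Accept only http:// and https:// links that have a host part."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)
```

Resolving the path first and comparing afterwards catches both `../` sequences and absolute paths passed as the name, which a simple string filter can miss.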

Bug fixes

Fix llama-server not being killed when the parent process exits on Windows, e.g. when closing the console window or killing python.exe (#7574).
Fix streaming output leaking across chats when switching mid-stream (#7555).
Fix continue-mode regressions across template families.
Fix incorrect prompts generated with continue mode. Thanks, @MeemeeLab.
Fix thinking channel being lost across tool-call turns (#7578).
Fix API model load silently dropping hyphenated arg keys (#7577).
Fix chat deletion failing when user_data/logs is a symlink (#7579).
Fix token count not being set in non-streaming mode.
Keep web search blocks closed when the user closes them mid-stream.
fix(win): set PYTHONUTF8 for non-ASCII locale Windows compatibility (#7560). Thanks, @jerry78424.
Set TORCH_VERSION to 2.9.0 to match xformers 0.0.33's torch pin (#7581). Thanks, @AJ-Gazin.
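
The hyphenated arg key fix (#7577) concerns keys such as `ctx-size` being dropped without any error. One way such keys could be handled is sketched below in Python. The function `normalize_arg_keys` is hypothetical, and mapping hyphens to underscores is an assumption about the approach, not a description of the actual patch.

```python
def normalize_arg_keys(args: dict) -> dict:
    """Return a copy of args with hyphens in keys replaced by underscores."""
    normalized = {}
    for key, value in args.items():
        new_key = key.replace("-", "_")
        # Fail loudly instead of silently overwriting or dropping a key.
        if new_key in normalized:
            raise ValueError(f"duplicate argument after normalization: {new_key}")
        normalized[new_key] = value
    return normalized
```

With this sketch, `{"ctx-size": 8192}` and `{"ctx_size": 8192}` produce the same result, and passing both spellings at once raises an error rather than picking one silently.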

Dependency updates

Update llama.cpp to https://github.com/ggml-org/llama.cpp/commit/e947228222147356bc7e64154d3439e142481632
Update ik_llama.cpp to https://github.com/ikawrakow/ik_llama.cpp/commit/40254a51daf485b2b644bcb82a84278d95745ee5
Update ExLlamaV3 to 0.0.34

Portable builds

TextGen is now a desktop app for local LLMs. Download, unzip, double-click.

Note (NVIDIA GPU): If nvidia-smi reports CUDA Version >= 13.1, use the cuda13.1 build. Otherwise, use cuda12.4.
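
The rule in the note can be written as a small Python helper. The function name `pick_cuda_build` is hypothetical, and it assumes you pass in the "CUDA Version" value that nvidia-smi prints (for example "12.8"); reading that value from nvidia-smi is left out.

```python
def pick_cuda_build(reported_cuda_version: str) -> str:
    """Return 'cuda13.1' or 'cuda12.4' for a version string such as '12.8'."""
    parts = reported_cuda_version.strip().split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    # Compare as (major, minor) integers, not as strings or floats,
    # so that a version like '13.10' is ordered correctly.
    return "cuda13.1" if (major, minor) >= (13, 1) else "cuda12.4"
```

Note that a reported version of 13.0 still selects the cuda12.4 build under this rule, because it is below 13.1.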

ik_llama.cpp is a llama.cpp fork with new quant types. If unsure, use the llama.cpp column.

Windows

GPU/Platform	llama.cpp	ik_llama.cpp
NVIDIA (CUDA 12.4)	Download (936 MB)	Download (1.24 GB)
NVIDIA (CUDA 13.1)	Download (840 MB)	Download (1.33 GB)
AMD/Intel (Vulkan)	Download (336 MB)	—
AMD (ROCm 7.2)	Download (617 MB)	—
CPU only	Download (319 MB)	Download (335 MB)

Linux

GPU/Platform	llama.cpp	ik_llama.cpp
NVIDIA (CUDA 12.4)	Download (893 MB)	Download (1.21 GB)
NVIDIA (CUDA 13.1)	Download (826 MB)	Download (1.33 GB)
NVIDIA ARM64 (CUDA 13.1)	Download (910 MB)	—
AMD/Intel (Vulkan)	Download (324 MB)	—
AMD (ROCm 7.2)	Download (409 MB)	—
CPU only	Download (307 MB)	Download (338 MB)

macOS

macOS note: You need to run xattr -cr /path/to/your/textgen-folder on the extracted folder before launching. See https://github.com/oobabooga/textgen/issues/7558.

Architecture	llama.cpp
Apple Silicon (arm64)	Download (272 MB)
Intel (x86_64)	Download (284 MB)

Updating a portable install:

Download and extract the latest version.
Replace the user_data folder in the new install with the one from your existing install. All your settings and models carry over.

Starting with 4.0, you can also move user_data one folder up, next to the install folder. It will be detected automatically, making updates easier:

textgen-4.6/
textgen-4.7/
user_data/    <-- shared by both installs
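
The layout above implies a lookup along the following lines, sketched in Python. The function `find_user_data` is hypothetical, and the precedence (a sibling user_data folder wins over one inside the install folder) is an assumption; the notes only state that the shared folder is detected automatically.

```python
from pathlib import Path


def find_user_data(install_dir: str) -> Path:
    """Pick the user_data folder for a portable install.

    Assumed order: a user_data folder one level up (next to the
    install folder) is used if it exists, else the one inside it.
    """
    install = Path(install_dir).resolve()
    shared = install.parent / "user_data"
    if shared.is_dir():
        return shared
    return install / "user_data"
```

Under this assumption, both textgen-4.6/ and textgen-4.7/ in the layout above would resolve to the same shared user_data/ folder.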

Source: README.md, updated 2026-05-20
