Name Modified Size
textgen-portable-ik-4.9-windows-cuda12.4.zip 2026-05-20 1.3 GB
textgen-portable-ik-4.9-windows-cuda13.1.zip 2026-05-20 1.4 GB
textgen-portable-4.9-windows-cuda12.4.zip 2026-05-20 981.3 MB
textgen-portable-4.9-windows-rocm7.2.zip 2026-05-20 646.9 MB
textgen-portable-4.9-windows-cuda13.1.zip 2026-05-20 880.6 MB
textgen-portable-4.9-windows-cpu.zip 2026-05-20 334.9 MB
textgen-portable-ik-4.9-windows-cpu.zip 2026-05-20 351.7 MB
textgen-portable-4.9-windows-vulkan.zip 2026-05-20 352.2 MB
textgen-portable-ik-4.9-linux-cuda12.4.tar.gz 2026-05-20 1.3 GB
textgen-portable-4.9-macos-x86_64.tar.gz 2026-05-20 297.4 MB
textgen-portable-ik-4.9-linux-cuda13.1.tar.gz 2026-05-20 1.4 GB
textgen-portable-4.9-linux-cuda12.4.tar.gz 2026-05-20 936.7 MB
textgen-portable-4.9-linux-arm64-cuda13.1.tar.gz 2026-05-20 953.9 MB
textgen-portable-4.9-linux-cuda13.1.tar.gz 2026-05-20 866.4 MB
textgen-portable-4.9-linux-cpu.tar.gz 2026-05-20 322.4 MB
textgen-portable-4.9-linux-rocm7.2.tar.gz 2026-05-20 429.0 MB
textgen-portable-4.9-linux-vulkan.tar.gz 2026-05-20 339.8 MB
textgen-portable-ik-4.9-linux-cpu.tar.gz 2026-05-20 354.5 MB
textgen-portable-4.9-macos-arm64.tar.gz 2026-05-20 285.0 MB
README.md 2026-05-20 7.7 kB
v4.9 source code.tar.gz 2026-05-20 24.9 MB
v4.9 source code.zip 2026-05-20 25.1 MB
Totals: 22 Items, 13.9 GB

Changes

  • MTP speculative decoding support: Add draft-mtp as a new --spec-type option. Auto-enabled when loading MTP GGUFs (e.g. Qwen 3.6 MoE MTP builds).
  • Web search improvements:
      • Add snippet support to the web_search tool: results now include a short text excerpt that often answers the query directly, eliminating the need for a follow-up fetch_webpage call (#7548).
      • Drop link URLs from fetch_webpage output (links now appear as plain text instead of [text](url) markdown), significantly reducing tokens used per page.
      • Prettier rendering of web_search results in the chat, with a spinner during the call.
      • Add an info message to the "Activate web search" checkbox.
  • Show live generation speed (tokens/s) and context size while generating (#7563).
  • DGX Spark support: Add Linux aarch64 portable builds.
  • Electron:
      • Add "Check for updates" button in the Session tab.
      • Add a folder picker for the models directory.
      • Add right-click context menu for copying text.
      • Add a spellcheck toggle in the Session tab (#7550).
      • Store app data in user_data/cache/electron instead of the OS default location.
      • Disable DNS-over-HTTPS probes.
  • One-click installer: Track the latest release tag instead of bleeding-edge main.
  • Auto-detect and auto-select sibling mmproj files when loading a model (#7564).
  • Detect mmproj-*.gguf files in the main models folder: They appear in the mmproj dropdown and are hidden from the regular model dropdown.
  • Project icon: Add an icon, courtesy of LMLocalizer on Reddit.
  • Treat negative --ctx-size values as auto (0).
  • UI:
      • Add drag-and-drop file upload support to the chat input (Gradio fork).
      • Reorganize the right sidebar with Mode/Character/Chat style on top.
      • Hide reasoning and tools controls in chat mode (only shown in instruct / chat-instruct).
      • Fade in new messages, fix scroll-up jump on send.
      • Rename "Send dummy message/reply" to "Insert user/assistant message".
      • Polish character dropdown in chat tab.
      • Tighten spacing between dropdowns and refresh buttons.
      • Improve the looks of the Session tab.
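
The fetch_webpage change above (links rendered as plain text instead of [text](url) markdown) can be illustrated with a minimal sketch. This is not the project's implementation; the function name `strip_link_urls` and the regular expression are assumptions chosen for illustration, and image syntax such as ![alt](url) is not treated specially.

```python
import re

# Matches inline markdown links of the form [text](url).
# Hypothetical pattern for illustration; the real tool may work differently.
_MD_LINK = re.compile(r"\[([^\]]+)\]\([^)\s]+\)")


def strip_link_urls(markdown: str) -> str:
    """Replace each [text](url) with just its link text."""
    return _MD_LINK.sub(r"\1", markdown)


page = "See the [install guide](https://example.com/install) and [FAQ](https://example.com/faq)."
print(strip_link_urls(page))  # prints: See the install guide and FAQ.
```

Because URLs are often long and tokenize poorly, dropping them tends to shrink the text handed to the model while keeping the readable link labels.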

Security

  • Restrict CORS to localhost by default to prevent drive-by API access. --listen and --public-api opt into network exposure.
  • Sanitize character name in load_character to prevent path traversal.
  • Prevent path traversal in load_template_by_name (#7562). Thanks, @Allen930311.
  • UI: Improve web search security by rejecting non-HTTP links.
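
As an illustration of the path traversal fixes listed above, here is a minimal sketch of one common defense: resolve the requested path and refuse it unless it stays inside the expected directory. The helper name `resolve_inside`, the `.yaml` suffix, and the directory layout are assumptions for this example, not the project's actual code.

```python
from pathlib import Path


def resolve_inside(base_dir: Path, name: str, suffix: str = ".yaml") -> Path:
    """Return base_dir/name+suffix, rejecting names that escape base_dir.

    Hypothetical helper: the real load_character / load_template_by_name
    functions may validate names differently.
    """
    base = base_dir.resolve()
    candidate = (base / f"{name}{suffix}").resolve()
    # Path.is_relative_to requires Python 3.9+.
    if not candidate.is_relative_to(base):
        raise ValueError(f"refusing path outside {base}: {name!r}")
    return candidate


characters = Path("user_data/characters")  # assumed location, for illustration
print(resolve_inside(characters, "Assistant").name)  # prints: Assistant.yaml
```

A name such as `../../secrets` resolves to a location outside the base directory and raises `ValueError` instead of being opened.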

Bug fixes

  • Fix llama-server not being killed when the parent process exits on Windows, e.g. when closing the console window or killing python.exe (#7574).
  • Fix streaming output leaking across chats when switching mid-stream (#7555).
  • Fix continue-mode regressions across template families.
  • Fix incorrect prompts generated with continue mode. Thanks, @MeemeeLab.
  • Fix thinking channel being lost across tool-call turns (#7578).
  • Fix API model load silently dropping hyphenated arg keys (#7577).
  • Fix chat deletion failing when user_data/logs is a symlink (#7579).
  • Fix token count not being set in non-streaming mode.
  • Keep web search blocks closed when the user closes them mid-stream.
  • Set PYTHONUTF8 on Windows for compatibility with non-ASCII locales (#7560). Thanks, @jerry78424.
  • Set TORCH_VERSION to 2.9.0 to match xformers 0.0.33's torch pin (#7581). Thanks, @AJ-Gazin.
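
The hyphenated arg keys fix (#7577) concerns request keys such as `ctx-size` being dropped when a model is loaded through the API. One plausible way to accept both spellings is to normalize keys before they are matched against Python-style names. The sketch below assumes that approach; the function name and the payload shape are hypothetical.

```python
def normalize_arg_keys(args: dict) -> dict:
    """Map hyphenated keys like 'ctx-size' to 'ctx_size'.

    Hypothetical sketch: the actual fix in the project may differ.
    """
    return {key.replace("-", "_"): value for key, value in args.items()}


request_args = {"ctx-size": 8192, "spec-type": "draft-mtp"}
print(normalize_arg_keys(request_args))
# prints: {'ctx_size': 8192, 'spec_type': 'draft-mtp'}
```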

Dependency updates

Portable builds

TextGen is now a desktop app for local LLMs. Download, unzip, double-click.

[!NOTE] NVIDIA GPU: If nvidia-smi reports CUDA Version >= 13.1, use the cuda13.1 build. Otherwise, use cuda12.4.
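
The rule in the note can be sketched as a small helper that reads the "CUDA Version" field from nvidia-smi output and picks a build name. The function name `pick_cuda_build` is hypothetical, and the sketch assumes the header contains text of the form `CUDA Version: 12.4`.

```python
import re


def pick_cuda_build(nvidia_smi_output: str) -> str:
    """Return 'cuda13.1' if the reported CUDA version is >= 13.1, else 'cuda12.4'."""
    # Assumes the nvidia-smi header includes "CUDA Version: <major>.<minor>".
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", nvidia_smi_output)
    if match is None:
        raise ValueError("no 'CUDA Version' field found in nvidia-smi output")
    version = (int(match.group(1)), int(match.group(2)))
    # Tuple comparison handles major and minor together, so 13.0 < 13.1.
    return "cuda13.1" if version >= (13, 1) else "cuda12.4"


header = "| NVIDIA-SMI 550.54.14   Driver Version: 550.54.14   CUDA Version: 12.4 |"
print(pick_cuda_build(header))  # prints: cuda12.4
```

Comparing `(major, minor)` tuples avoids the pitfalls of comparing version strings or floats.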

ik_llama.cpp is a llama.cpp fork with new quant types. If unsure, use the llama.cpp column.

Windows

| GPU/Platform | llama.cpp | ik_llama.cpp |
|---|---|---|
| NVIDIA (CUDA 12.4) | Download (936 MB) | Download (1.24 GB) |
| NVIDIA (CUDA 13.1) | Download (840 MB) | Download (1.33 GB) |
| AMD/Intel (Vulkan) | Download (336 MB) | |
| AMD (ROCm 7.2) | Download (617 MB) | |
| CPU only | Download (319 MB) | Download (335 MB) |

Linux

| GPU/Platform | llama.cpp | ik_llama.cpp |
|---|---|---|
| NVIDIA (CUDA 12.4) | Download (893 MB) | Download (1.21 GB) |
| NVIDIA (CUDA 13.1) | Download (826 MB) | Download (1.33 GB) |
| NVIDIA ARM64 (CUDA 13.1) | Download (910 MB) | |
| AMD/Intel (Vulkan) | Download (324 MB) | |
| AMD (ROCm 7.2) | Download (409 MB) | |
| CPU only | Download (307 MB) | Download (338 MB) |

macOS

macOS note: You need to run xattr -cr /path/to/your/textgen-folder on the extracted folder before launching. See https://github.com/oobabooga/textgen/issues/7558.

| Architecture | llama.cpp |
|---|---|
| Apple Silicon (arm64) | Download (272 MB) |
| Intel (x86_64) | Download (284 MB) |

Updating a portable install:

  1. Download and extract the latest version.
  2. Replace the user_data folder in the new version with the one from your existing install. All your settings and models carry over.

Starting with 4.0, you can also move user_data one folder up, next to the install folder. It will be detected automatically, making updates easier:

    textgen-4.6/
    textgen-4.7/
    user_data/    <-- shared by both installs
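
The shared-folder layout above implies a lookup along these lines. This is only a sketch under stated assumptions: the function name `find_user_data` is hypothetical, and the assumption that a sibling user_data folder takes precedence over one inside the install folder is not confirmed by the release notes.

```python
from pathlib import Path


def find_user_data(install_dir: Path) -> Path:
    """Prefer a user_data folder next to the install folder, else the one inside it.

    Hypothetical sketch; the precedence order is an assumption.
    """
    shared = install_dir.parent / "user_data"
    if shared.is_dir():
        return shared
    return install_dir / "user_data"


# With the layout shown above, both installs resolve to the same shared folder:
#   find_user_data(Path("textgen-4.6")) and find_user_data(Path("textgen-4.7"))
```

With this kind of lookup, updating amounts to extracting the new version beside the old one, with no need to copy user_data.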
Source: README.md, updated 2026-05-20