v3.12
| Name | Modified | Size |
|------|----------|------|
| textgen-portable-3.12-windows-cuda12.4.zip | 2025-09-02 | 841.0 MB |
| textgen-portable-3.12-windows-cuda11.7.zip | 2025-09-02 | 729.0 MB |
| textgen-portable-3.12-windows-vulkan.zip | 2025-09-02 | 207.6 MB |
| textgen-portable-3.12-windows-cpu.zip | 2025-09-02 | 194.4 MB |
| textgen-portable-3.12-linux-cuda12.4.zip | 2025-09-02 | 846.5 MB |
| textgen-portable-3.12-linux-cuda11.7.zip | 2025-09-02 | 774.1 MB |
| textgen-portable-3.12-linux-vulkan.zip | 2025-09-02 | 246.3 MB |
| textgen-portable-3.12-linux-cpu.zip | 2025-09-02 | 232.9 MB |
| textgen-portable-3.12-macos-arm64.zip | 2025-09-02 | 177.9 MB |
| textgen-portable-3.12-macos-x86_64.zip | 2025-09-02 | 165.1 MB |
| README.md | 2025-09-02 | 2.7 kB |
| v3.12 source code.tar.gz | 2025-09-02 | 24.9 MB |
| v3.12 source code.zip | 2025-09-02 | 25.0 MB |

Totals: 13 items, 4.5 GB

Changes

  • Characters can now think in chat-instruct mode! This was possible thanks to many simplifications and improvements to jinja2 template handling (see the illustrative sketch after this list).

  • Add support for the Seed-OSS-36B-Instruct template.
  • Better handle the growth of the chat input textarea (before/after screenshots in the original release notes).
  • Make the --model flag work with absolute paths for gguf models, like --model /tmp/gemma-3-270m-it-IQ4_NL.gguf
  • Make venv portable installs work with Python 3.13
  • Optimize LaTeX rendering during streaming for long replies
  • Give streaming instruct messages more vertical space
  • Preload the instruct and chat fonts for smoother startup
  • Improve right sidebar borders in light mode
  • Remove the --flash-attn flag (it's always on now in llama.cpp)
  • Suppress "Attempted to select a non-interactive or hidden tab" console warnings, reducing the UI CPU usage during streaming
  • Statically link MSVC runtime to remove the Visual C++ Redistributable dependency on Windows for the llama.cpp binaries
  • Make the llama.cpp terminal output with --verbose less verbose
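For readers unfamiliar with how jinja2 chat templates enable features like thinking in chat-instruct mode, here is a minimal, hypothetical sketch. It is not the project's actual template code; the template string, the `thinking` field, and the `<think>` tags are illustrative assumptions showing how a template can emit a character's thinking block before the visible reply:

```python
# Illustrative only: a simplified jinja2 chat template with an optional
# "thinking" section per message. Not the project's real template.
from jinja2 import Template

template = Template(
    "{% for m in messages %}"
    "<|{{ m.role }}|>\n"
    "{% if m.thinking %}<think>{{ m.thinking }}</think>\n{% endif %}"
    "{{ m.content }}\n"
    "{% endfor %}"
)

messages = [
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "thinking": "Greet warmly.", "content": "Hi there!"},
]

# Renders the thinking block inline before the assistant's visible reply.
print(template.render(messages=messages))
```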

Bug fixes

  • llama.cpp: Fix stderr deadlock while loading some models
  • llama.cpp: Fix obtaining the maximum sequence length for GPT-OSS
  • Fix the UI failing to launch if the Notebook prompt is too long
  • Fix LaTeX rendering for equations with asterisks
  • Fix italic and quote colors in headings

Backend updates


Portable builds

Below you can find self-contained packages that work with GGUF models (llama.cpp) and require no installation! Just download the right version for your system, unzip, and run.

Which version to download:

  • Windows/Linux:
      • NVIDIA GPU: Use cuda12.4 for newer GPUs or cuda11.7 for older GPUs and systems with older drivers.
      • AMD/Intel GPU: Use vulkan builds.
      • CPU only: Use cpu builds.
  • Mac:
      • Apple Silicon: Use macos-arm64.
      • Intel CPU: Use macos-x86_64.
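As a rough illustration of the selection logic above, here is a small Python sketch. The `suggest_build` helper and its `gpu` parameter are hypothetical, but the returned names match the build stems in the table:

```python
# Illustrative sketch: maps the running system to a suggested portable
# build name, following the guidance in this README.
import platform

def suggest_build(gpu: str | None = None) -> str:
    system = platform.system()
    machine = platform.machine().lower()
    if system == "Darwin":
        return "macos-arm64" if machine == "arm64" else "macos-x86_64"
    os_part = "windows" if system == "Windows" else "linux"
    if gpu == "nvidia":
        return f"{os_part}-cuda12.4"  # use cuda11.7 for older GPUs/drivers
    if gpu in ("amd", "intel"):
        return f"{os_part}-vulkan"
    return f"{os_part}-cpu"

print(suggest_build(gpu="nvidia"))
```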

Updating a portable install:

  1. Download and unzip the latest version.
  2. Replace the user_data folder in the new version with the one from your existing install. All your settings and models will be carried over.
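A minimal Python sketch of step 2, assuming placeholder folder names for the old and new installs:

```python
# Illustrative sketch: carry user_data from an old portable install into
# a freshly unzipped one. The folder names below are placeholders.
import shutil
from pathlib import Path

old_install = Path("textgen-portable-3.11")  # your existing install
new_install = Path("textgen-portable-3.12")  # freshly unzipped folder

# Remove the fresh (empty) user_data, then copy over the old one with
# all settings and models.
shutil.rmtree(new_install / "user_data", ignore_errors=True)
shutil.copytree(old_install / "user_data", new_install / "user_data")
print("Settings and models carried over.")
```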