Name | Modified | Size
---|---|---
textgen-portable-3.12-windows-cuda12.4.zip | 2025-09-02 | 841.0 MB
textgen-portable-3.12-windows-cuda11.7.zip | 2025-09-02 | 729.0 MB
textgen-portable-3.12-windows-vulkan.zip | 2025-09-02 | 207.6 MB
textgen-portable-3.12-windows-cpu.zip | 2025-09-02 | 194.4 MB
textgen-portable-3.12-linux-cuda12.4.zip | 2025-09-02 | 846.5 MB
textgen-portable-3.12-linux-cuda11.7.zip | 2025-09-02 | 774.1 MB
textgen-portable-3.12-linux-vulkan.zip | 2025-09-02 | 246.3 MB
textgen-portable-3.12-linux-cpu.zip | 2025-09-02 | 232.9 MB
textgen-portable-3.12-macos-arm64.zip | 2025-09-02 | 177.9 MB
textgen-portable-3.12-macos-x86_64.zip | 2025-09-02 | 165.1 MB
README.md | 2025-09-02 | 2.7 kB
v3.12 source code.tar.gz | 2025-09-02 | 24.9 MB
v3.12 source code.zip | 2025-09-02 | 25.0 MB
## Changes
- Characters can now think in `chat-instruct` mode! This was possible thanks to many simplifications and improvements to jinja2 template handling.
- Add support for the Seed-OSS-36B-Instruct template.
- Better handle the growth of the chat input textarea (before/after screenshots omitted).
- Make the `--model` flag work with absolute paths for GGUF models, like `--model /tmp/gemma-3-270m-it-IQ4_NL.gguf` (see the example after this list)
- Make venv portable installs work with Python 3.13
- Optimize LaTeX rendering during streaming for long replies
- Give streaming instruct messages more vertical space
- Preload the instruct and chat fonts for smoother startup
- Improve right sidebar borders in light mode
- Remove the `--flash-attn` flag (it's always on now in llama.cpp)
- Suppress "Attempted to select a non-interactive or hidden tab" console warnings, reducing the UI CPU usage during streaming
- Statically link the MSVC runtime into the llama.cpp binaries on Windows, removing the Visual C++ Redistributable dependency
- Make the llama.cpp terminal output with `--verbose` less verbose
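As a quick illustration of the `--model` change above: from a source install launched via `server.py` (the portable start scripts should pass through the same flags), an absolute GGUF path can now be given directly. A minimal sketch, assuming the model file from the changelog entry exists at that path:

```sh
# Per the changelog above, absolute paths to GGUF files now work with
# --model, and --verbose output is quieter as of this release.
python server.py --model /tmp/gemma-3-270m-it-IQ4_NL.gguf --verbose
```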
## Bug fixes
- llama.cpp: Fix stderr deadlock while loading some models
- llama.cpp: Fix obtaining the maximum sequence length for GPT-OSS
- Fix the UI failing to launch if the Notebook prompt is too long
- Fix LaTeX rendering for equations with asterisks
- Fix italic and quote colors in headings
## Backend updates
- Update llama.cpp to https://github.com/ggml-org/llama.cpp/tree/9961d244f2df6baf40af2f1ddc0927f8d91578c8
## Portable builds
Below you can find self-contained packages that work with GGUF models (llama.cpp) and require no installation! Just download the right version for your system, unzip, and run.
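For instance, a first run on Linux might look like the sketch below. The extracted folder name and start script are assumptions based on the file list above rather than verified details; the bundled README.md documents the actual entry point.

```sh
# Hypothetical quick start for the Linux CPU build; the directory and
# script names are assumptions, so check the shipped README.md first.
unzip textgen-portable-3.12-linux-cpu.zip
cd textgen-portable-3.12-linux-cpu
./start_linux.sh
```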
**Which version to download:**

- **Windows/Linux:**
  - **NVIDIA GPU:** Use `cuda12.4` for newer GPUs or `cuda11.7` for older GPUs and systems with older drivers (a quick check is sketched after this list).
  - **AMD/Intel GPU:** Use `vulkan` builds.
  - **CPU only:** Use `cpu` builds.
- **Mac:**
  - **Apple Silicon:** Use `macos-arm64`.
  - **Intel CPU:** Use `macos-x86_64`.
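If you are unsure which CUDA build your NVIDIA driver supports, one quick check (assuming the driver is installed, which provides `nvidia-smi`):

```sh
# The header of nvidia-smi's output includes a "CUDA Version" field showing
# the newest CUDA runtime the driver supports: 12.4 or higher points to the
# cuda12.4 build, anything older to cuda11.7.
nvidia-smi
```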
**Updating a portable install:**

1. Download and unzip the latest version.
2. Replace its `user_data` folder with the one from your existing install (sketched below). All your settings and models will be moved.
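A minimal sketch of that second step on Linux/macOS, assuming the old and new versions were unzipped to the hypothetical folders `textgen-old/` and `textgen-new/`:

```sh
# Carry settings and models over from the previous portable install.
# Folder names are placeholders for your actual unzip locations.
rm -rf textgen-new/user_data              # discard the fresh default folder
cp -r textgen-old/user_data textgen-new/  # keep your settings and models
```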