Name | Modified | Size
---|---|---
textgen-portable-3.10-windows-cuda11.7.zip | 2025-08-12 | 729.6 MB
textgen-portable-3.10-windows-cuda12.4.zip | 2025-08-12 | 841.7 MB
textgen-portable-3.10-windows-vulkan.zip | 2025-08-12 | 202.6 MB
textgen-portable-3.10-windows-cpu.zip | 2025-08-12 | 193.5 MB
textgen-portable-3.10-linux-cuda12.4.zip | 2025-08-12 | 846.8 MB
textgen-portable-3.10-linux-cuda11.7.zip | 2025-08-12 | 774.3 MB
textgen-portable-3.10-macos-x86_64.zip | 2025-08-12 | 164.8 MB
textgen-portable-3.10-linux-cpu.zip | 2025-08-12 | 231.7 MB
textgen-portable-3.10-linux-vulkan.zip | 2025-08-12 | 240.8 MB
textgen-portable-3.10-macos-arm64.zip | 2025-08-12 | 176.7 MB
README.md | 2025-08-12 | 2.1 kB
v3.10 - Multimodal support! source code.tar.gz | 2025-08-12 | 24.9 MB
v3.10 - Multimodal support! source code.zip | 2025-08-12 | 25.0 MB
See the Multimodal Tutorial
Changes
- Add multimodal support to the UI and API (see the example request after this list)
  - With the llama.cpp loader (#7027). This was possible thanks to PR https://github.com/ggml-org/llama.cpp/pull/15108 to llama.cpp. Thanks @65a.
  - With ExLlamaV3 through a new ExLlamaV3 loader (#7174). Thanks @Katehuuh.
- Add speculative decoding to the new ExLlamaV3 loader.
- Use ExLlamav3 instead of ExLlamav3_HF by default for EXL3 models, since it supports multimodal and speculative decoding.
- Support loading chat templates from `chat_template.json` files (EXL3/EXL2/Transformers models)
- Default max_tokens to 512 in the API instead of 16
- Better organize the right sidebar in the UI
- llama.cpp: Pass `--swa-full` to llama-server when `streaming-llm` is checked to make it work for models with SWA.
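Since multimodal input is now exposed through the API and `max_tokens` defaults to 512, here is a minimal request sketch. It assumes the default local OpenAI-compatible endpoint at `http://127.0.0.1:5000/v1/chat/completions` and OpenAI-style `image_url` content parts; the exact payload shape accepted by the new multimodal path is an assumption, so treat this as an illustration rather than a reference.

```python
# Sketch of a multimodal chat completion request.
# Assumptions: local server on port 5000 with the OpenAI-compatible API enabled,
# and OpenAI-style image_url content parts.
import base64
import requests

with open("photo.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

payload = {
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
            ],
        }
    ],
    # If omitted, max_tokens now defaults to 512 (previously 16).
    "max_tokens": 512,
}

r = requests.post("http://127.0.0.1:5000/v1/chat/completions", json=payload)
print(r.json()["choices"][0]["message"]["content"])
```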
Bug fixes
- Fix getting the ctx-size for newer EXL3/EXL2/Transformers models
- Fix the exllamav2 loader ignoring add_bos_token
- Fix the color of italic text in chat messages
- Fix edit window and buttons in Messenger theme (#7100). Thanks @mykeehu.
Backend updates
- Bump llama.cpp to https://github.com/ggml-org/llama.cpp/commit/f4586ee5986d6f965becb37876d6f3666478a961
Portable builds
Below you can find self-contained packages that work with GGUF models (llama.cpp) and require no installation! Just download the right version for your system, unzip, and run.
Which version to download:
- Windows/Linux:
  - NVIDIA GPU: Use `cuda12.4` for newer GPUs or `cuda11.7` for older GPUs and systems with older drivers.
  - AMD/Intel GPU: Use `vulkan` builds.
  - CPU only: Use `cpu` builds.
- Mac:
  - Apple Silicon: Use `macos-arm64`.
  - Intel CPU: Use `macos-x86_64`.
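If you want to script that choice, the sketch below maps the current machine to one of the archive names from the file list above. The filename pattern comes from the table, but the function and its defaults are illustrative only, and the backend suffix (`cuda12.4`, `cuda11.7`, `vulkan`, or `cpu`) still has to be chosen by hand on Windows/Linux since it depends on your GPU and drivers.

```python
# Sketch: map the current OS/architecture to a portable build name from the table above.
# The backend (cuda12.4, cuda11.7, vulkan, or cpu) must be picked manually on Windows/Linux.
import platform

def portable_build_name(version: str = "3.10", backend: str = "cpu") -> str:
    system = platform.system()
    if system == "Darwin":
        arch = "arm64" if platform.machine() == "arm64" else "x86_64"
        return f"textgen-portable-{version}-macos-{arch}.zip"
    if system == "Windows":
        return f"textgen-portable-{version}-windows-{backend}.zip"
    if system == "Linux":
        return f"textgen-portable-{version}-linux-{backend}.zip"
    raise RuntimeError(f"Unsupported platform: {system}")

print(portable_build_name(backend="cuda12.4"))  # e.g. textgen-portable-3.10-linux-cuda12.4.zip
```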
Updating a portable install:
- Download and unzip the latest version.
- Replace the `user_data` folder in the new install with the one from your existing install. All your settings and models will carry over.
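
The same two steps can be scripted. The sketch below assumes both portable installs are already unzipped side by side; the folder names are placeholders, so adjust them to wherever your installs actually live.

```python
# Sketch: carry user_data from an existing portable install into a freshly unzipped one.
# The folder names below are placeholders, not the real extracted directory names.
import shutil
from pathlib import Path

old_install = Path("textgen-portable-old")   # existing install with your settings/models
new_install = Path("textgen-portable-3.10")  # freshly unzipped install

# Remove the empty user_data shipped with the new build, then copy over the old one.
shutil.rmtree(new_install / "user_data", ignore_errors=True)
shutil.copytree(old_install / "user_data", new_install / "user_data")
```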