Text Generation Web UI v3.10 - Multimodal support!

Name                                            Modified    Size
textgen-portable-3.10-windows-cuda11.7.zip      2025-08-12  729.6 MB
textgen-portable-3.10-windows-cuda12.4.zip      2025-08-12  841.7 MB
textgen-portable-3.10-windows-vulkan.zip        2025-08-12  202.6 MB
textgen-portable-3.10-windows-cpu.zip           2025-08-12  193.5 MB
textgen-portable-3.10-linux-cuda12.4.zip        2025-08-12  846.8 MB
textgen-portable-3.10-linux-cuda11.7.zip        2025-08-12  774.3 MB
textgen-portable-3.10-macos-x86_64.zip          2025-08-12  164.8 MB
textgen-portable-3.10-linux-cpu.zip             2025-08-12  231.7 MB
textgen-portable-3.10-linux-vulkan.zip          2025-08-12  240.8 MB
textgen-portable-3.10-macos-arm64.zip           2025-08-12  176.7 MB
README.md                                       2025-08-12  2.1 kB
v3.10 - Multimodal support! source code.tar.gz  2025-08-12  24.9 MB
v3.10 - Multimodal support! source code.zip     2025-08-12  25.0 MB

Totals: 13 items, 4.5 GB

See the Multimodal Tutorial

Changes

  • Add multimodal support to the UI and API (see the API sketch after this list):
      • With the llama.cpp loader (#7027). This was possible thanks to PR https://github.com/ggml-org/llama.cpp/pull/15108 to llama.cpp. Thanks @65a.
      • With ExLlamaV3 through a new ExLlamaV3 loader (#7174). Thanks @Katehuuh.
  • Add speculative decoding to the new ExLlamaV3 loader.
  • Use ExLlamav3 instead of ExLlamav3_HF by default for EXL3 models, since it supports multimodal and speculative decoding.
  • Support loading chat templates from chat_template.json files (EXL3/EXL2/Transformers models); see the sketch after this list.
  • Default max_tokens to 512 in the API instead of 16
  • Better organize the right sidebar in the UI
  • llama.cpp: Pass --swa-full to llama-server when streaming-llm is checked, so that it works with models that use sliding-window attention (SWA); see the sketch after this list.
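
The multimodal API change pairs with the new max_tokens default. The sketch below sends an image to the OpenAI-compatible chat endpoint; it assumes the UI is running with its API enabled on the default port (5000) and that images are passed as OpenAI-style image_url content parts. It is a minimal illustration, not the project's documented client.

```python
# Minimal sketch: send an image to the OpenAI-compatible chat endpoint.
# Assumptions: the UI is running with its API enabled on the default port,
# and multimodal requests use OpenAI-style image_url content parts.
import base64
import requests

with open("photo.png", "rb") as f:  # hypothetical local image
    image_b64 = base64.b64encode(f.read()).decode()

response = requests.post(
    "http://127.0.0.1:5000/v1/chat/completions",
    json={
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "Describe this image."},
                    {
                        "type": "image_url",
                        "image_url": {"url": f"data:image/png;base64,{image_b64}"},
                    },
                ],
            }
        ],
        # max_tokens now defaults to 512 (previously 16), so this is optional.
        "max_tokens": 512,
    },
    timeout=120,
)
print(response.json()["choices"][0]["message"]["content"])
```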
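
The chat_template.json support can be illustrated in isolation. Transformers-format models commonly store the template as a single chat_template key holding a Jinja string; the sketch below shows only that file layout, since many real templates need extra Jinja helpers that the UI's own loader supplies. The model path is a hypothetical placeholder.

```python
# Sketch: read a chat template from chat_template.json and render a prompt.
# Illustrates the file layout only; complex templates may need helpers
# (e.g. raise_exception) beyond what a bare jinja2 Template provides.
import json
from pathlib import Path

from jinja2 import Template  # pip install jinja2

template_file = Path("models/my-exl3-model") / "chat_template.json"  # hypothetical path
chat_template = json.loads(template_file.read_text())["chat_template"]

prompt = Template(chat_template).render(
    messages=[{"role": "user", "content": "Hello!"}],
    add_generation_prompt=True,
)
print(prompt)
```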
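
The last change above amounts to one extra flag on the llama-server command line. A minimal sketch, with an argument list that is illustrative rather than the exact command the UI builds internally:

```python
# Sketch: append --swa-full when the streaming-llm option is enabled.
# The surrounding arguments are illustrative, not the UI's exact command.
def build_llama_server_args(model_path: str, streaming_llm: bool) -> list[str]:
    args = ["llama-server", "--model", model_path]
    if streaming_llm:
        # --swa-full keeps a full-size cache for sliding-window attention,
        # which StreamingLLM-style cache reuse needs on SWA models.
        args.append("--swa-full")
    return args

print(build_llama_server_args("models/example.gguf", streaming_llm=True))
```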

Bug fixes

  • Fix getting the ctx-size for newer EXL3/EXL2/Transformers models
  • Fix the exllamav2 loader ignoring add_bos_token
  • Fix the color of italic text in chat messages
  • Fix edit window and buttons in Messenger theme (#7100). Thanks @mykeehu.

Backend updates


Portable builds

These self-contained packages work with GGUF models (llama.cpp) and require no installation: just download the right version for your system, unzip, and run.

Which version to download:

  • Windows/Linux:
      • NVIDIA GPU: Use cuda12.4 for newer GPUs, or cuda11.7 for older GPUs and systems with older drivers.
      • AMD/Intel GPU: Use the vulkan builds.
      • CPU only: Use the cpu builds.
  • Mac:
      • Apple Silicon: Use macos-arm64.
      • Intel CPU: Use macos-x86_64.

Updating a portable install:

  1. Download and unzip the latest version.
  2. Replace the new install's user_data folder with the one from your existing install. All of your settings and models will be carried over (see the sketch after these steps).
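
Step 2 amounts to copying one folder. A minimal sketch with hypothetical paths for the two unzipped installs:

```python
# Sketch: carry user_data over from an old portable install to a new one.
# Both paths are hypothetical placeholders for your actual unzip locations.
import shutil
from pathlib import Path

old_install = Path("textgen-portable-3.9")
new_install = Path("textgen-portable-3.10")

target = new_install / "user_data"
if target.exists():
    shutil.rmtree(target)  # discard the fresh install's default user_data
shutil.copytree(old_install / "user_data", target)  # settings and models carry over
```
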
Source: README.md, updated 2025-08-12