| Name | Modified | Size |
| --- | --- | --- |
| koboldcpp-linux-x64 | 2025-12-04 | 641.0 MB |
| koboldcpp.exe | 2025-12-04 | 639.9 MB |
| koboldcpp-oldpc.exe | 2025-12-04 | 458.3 MB |
| koboldcpp-nocuda.exe | 2025-12-04 | 100.7 MB |
| koboldcpp-mac-arm64 | 2025-12-04 | 37.8 MB |
| koboldcpp-linux-x64-oldpc | 2025-12-04 | 530.7 MB |
| koboldcpp-linux-x64-nocuda | 2025-12-04 | 116.9 MB |
| koboldcpp-1.103 source code.tar.gz | 2025-12-04 | 53.8 MB |
| koboldcpp-1.103 source code.zip | 2025-12-04 | 54.6 MB |
| README.md | 2025-12-04 | 3.9 kB |

Totals: 10 items, 2.6 GB, 202 downloads/week.

koboldcpp-1.103

  • NEW: Added support for Flux2 and Z-Image Turbo! Another big thanks to @leejet for the sd.cpp implementation and @wbruna for the assistance with testing and merging.
  • To obtain models for Z-Image (most recommended; lightweight):
  • To obtain models for Flux2 (not recommended: this model is huge, so the q2k quant is linked here. Remember to enable CPU offload; running anything larger requires a very powerful GPU):
    • Get the Flux 2 Image model here
    • Get the Flux 2 VAE here
    • Get the Flux 2 text encoder here and load it as Clip 1
  • NEW: Mistral and Ministral 3 model support has been merged from upstream.
  • Improved "Assistant Continue" in llama.cpp UI mode; it can now be used to continue partial turns.
  • We have added prefill support to chat completions if you have /lcpp in your URL (/lcpp/v1/chat/completions); the regular chat completions endpoint is meant to mimic OpenAI and does not do this. Point your frontend at whichever URL best fits your use case. We'd like feedback on which one you prefer and whether the /lcpp behavior would break an existing use case.
  • Minor tool calling fix to avoid passing base64 media strings into the tool call.
  • Tweaked resizing behavior of the launcher UI.
  • Added a secondary terminal UI to view the console logging (Linux only), which can be used even when KoboldCpp was not launched from a CLI. Launch this auxiliary terminal from the Extras tab.
  • AutoGuess template fixes for GPT-OSS and Kimi.
  • Fixed a bug with --showgui mode being saved into some configs.
  • Updated Kobold Lite with multiple fixes and improvements.
  • Merged fixes and improvements from upstream.
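The /lcpp prefill behavior above can be sketched as a request payload. A minimal Python sketch, assuming the endpoint accepts the standard OpenAI chat format and treats a trailing assistant message as the prefill to continue (the model name, messages, and helper function here are placeholders, not part of KoboldCpp itself):

```python
import json

def prefill_payload(user_msg, prefill):
    """Build a chat-completions body whose final assistant message is a prefill.

    Hypothetical helper for illustration; field names follow the OpenAI
    chat format that KoboldCpp's chat completions endpoint mimics.
    """
    return {
        "model": "kobold",  # placeholder; server-side model is whatever was loaded
        "messages": [
            {"role": "user", "content": user_msg},
            # Trailing assistant message = the partial turn to be continued
            # (only honored on the /lcpp/v1/chat/completions path).
            {"role": "assistant", "content": "Silver water runs" if prefill is None else prefill},
        ],
    }

body = json.dumps(prefill_payload("Write a haiku about rivers.", "Silver water runs"))
# POST this body to http://localhost:5001/lcpp/v1/chat/completions
```

Pointing the same payload at the plain /v1/chat/completions path would, per the note above, ignore the prefill semantics.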

Download and run koboldcpp.exe (Windows) or koboldcpp-linux-x64 (Linux), which is a one-file pyinstaller build for NVIDIA GPU users.
If you have an older CPU or older NVIDIA GPU and KoboldCpp does not work, try the oldpc version instead (CUDA 11 + AVX1).
If you don't have an NVIDIA GPU, or do not need CUDA, you can use the nocuda version, which is smaller.
If you're using AMD, we recommend trying the Vulkan option in the nocuda build for best support. If you're on a modern macOS device (M-series), you can use the koboldcpp-mac-arm64 binary. Click here for .gguf conversion and quantization tools.

Run it from the command line with the desired launch parameters (see --help), or manually select the model in the GUI. Once loaded, you can connect at http://localhost:5001 (or use the full KoboldAI client).
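Once a model is loaded, you can also connect programmatically. A minimal sketch, assuming the server runs on the default port 5001 and exposes the KoboldAI-style /api/v1/generate endpoint (the prompt, max_length value, and helper function are illustrative placeholders):

```python
import json
import urllib.request

def build_generate_request(prompt, max_length=80):
    """Build a POST request for KoboldCpp's KoboldAI-compatible generate endpoint.

    Hypothetical helper; exact accepted fields may vary by KoboldCpp version.
    """
    payload = {"prompt": prompt, "max_length": max_length}
    return urllib.request.Request(
        "http://localhost:5001/api/v1/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_generate_request("Once upon a time,")
print(req.full_url)  # http://localhost:5001/api/v1/generate
# With a running server, send it like this:
#   with urllib.request.urlopen(req) as resp:
#       print(json.load(resp)["results"][0]["text"])
```

KoboldCpp also serves an OpenAI-compatible chat completions endpoint under /v1/, so most OpenAI-style frontends can be pointed at the same port.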

For more information, be sure to run the program from command line with the --help flag. You can also refer to the readme and the wiki.

Source: README.md, updated 2025-12-04