| Name | Modified | Size |
| --- | --- | --- |
| koboldcpp-linux-x64 | 2025-12-04 | 641.0 MB |
| koboldcpp.exe | 2025-12-04 | 639.9 MB |
| koboldcpp-oldpc.exe | 2025-12-04 | 458.3 MB |
| koboldcpp-nocuda.exe | 2025-12-04 | 100.7 MB |
| koboldcpp-mac-arm64 | 2025-12-04 | 37.8 MB |
| koboldcpp-linux-x64-oldpc | 2025-12-04 | 530.7 MB |
| koboldcpp-linux-x64-nocuda | 2025-12-04 | 116.9 MB |
| koboldcpp-1.103 source code.tar.gz | 2025-12-04 | 53.8 MB |
| koboldcpp-1.103 source code.zip | 2025-12-04 | 54.6 MB |
| README.md | 2025-12-04 | 3.9 kB |

Totals: 10 items, 2.6 GB, 202 downloads/week.

koboldcpp-1.103

  • NEW: Added support for Flux2 and Z-Image Turbo! Another big thanks to @leejet for the sd.cpp implementation and @wbruna for the assistance with testing and merging.
  • To obtain models for Z-Image (most recommended; lightweight):
  • To obtain models for Flux2 (not recommended: this model is huge, so the q2k quant is linked here. Remember to enable CPU offload; running anything larger requires a very powerful GPU):
    • Get the Flux 2 Image model here
    • Get the Flux 2 VAE here
    • Get the Flux 2 text encoder here and load it as Clip 1
  • NEW: Mistral and Ministral 3 model support has been merged from upstream.
  • Improved "Assistant Continue" in llama.cpp UI mode; it can now be used to continue partial turns.
  • We have added prefill support to chat completions if you have /lcpp in your URL (/lcpp/v1/chat/completions); the regular chat completions endpoint is meant to mimic OpenAI and does not do this. Point your frontend at whichever URL best fits your use case. We'd like feedback on which one you prefer and whether the /lcpp behavior would break an existing use case.
  • Minor tool calling fix to avoid passing base64 media strings into the tool call.
  • Tweaked resizing behavior of the launcher UI.
  • Added a secondary terminal UI to view the console logging (Linux only), which can be used even when KoboldCpp was not launched from a CLI. Launch this auxiliary terminal from the Extras tab.
  • AutoGuess template fixes for GPT-OSS and Kimi.
  • Fixed a bug with --showgui mode being saved into some configs.
  • Updated Kobold Lite with multiple fixes and improvements.
  • Merged fixes and improvements from upstream.
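The /lcpp prefill behavior above can be sketched as a request payload. A minimal Python sketch, assuming the endpoint accepts the standard OpenAI chat format and treats a trailing assistant message as the prefill to continue (the model name, messages, and helper function here are placeholders, not part of KoboldCpp itself):

```python
import json

def prefill_payload(user_msg, prefill):
    """Build a chat-completions body whose final assistant message is a prefill.

    Hypothetical helper for illustration; field names follow the OpenAI
    chat format that KoboldCpp's chat completions endpoint mimics.
    """
    return {
        "model": "kobold",  # placeholder; server-side model is whatever was loaded
        "messages": [
            {"role": "user", "content": user_msg},
            # Trailing assistant message = the partial turn to be continued
            # (only honored on the /lcpp/v1/chat/completions path).
            {"role": "assistant", "content": "Silver water runs" if prefill is None else prefill},
        ],
    }

body = json.dumps(prefill_payload("Write a haiku about rivers.", "Silver water runs"))
# POST this body to http://localhost:5001/lcpp/v1/chat/completions
```

Pointing the same payload at the plain /v1/chat/completions path would, per the note above, ignore the prefill semantics.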

Download and run koboldcpp.exe (Windows) or koboldcpp-linux-x64 (Linux), which is a one-file pyinstaller build for NVIDIA GPU users.
If you have an older CPU or older NVIDIA GPU and KoboldCpp does not work, try the oldpc version instead (CUDA 11 + AVX1).
If you don't have an NVIDIA GPU, or do not need CUDA, you can use the nocuda version, which is smaller.
If you're using AMD, we recommend trying the Vulkan option in the nocuda build for best support. If you're on a modern macOS device (M-series), you can use the koboldcpp-mac-arm64 binary. Click here for .gguf conversion and quantization tools.

Run it from the command line with the desired launch parameters (see --help), or manually select the model in the GUI. Once loaded, you can connect at http://localhost:5001 (or use the full KoboldAI client).
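Once a model is loaded, you can also connect programmatically. A minimal sketch, assuming the server runs on the default port 5001 and exposes the KoboldAI-style /api/v1/generate endpoint (the prompt, max_length value, and helper function are illustrative placeholders):

```python
import json
import urllib.request

def build_generate_request(prompt, max_length=80):
    """Build a POST request for KoboldCpp's KoboldAI-compatible generate endpoint.

    Hypothetical helper; exact accepted fields may vary by KoboldCpp version.
    """
    payload = {"prompt": prompt, "max_length": max_length}
    return urllib.request.Request(
        "http://localhost:5001/api/v1/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_generate_request("Once upon a time,")
print(req.full_url)  # http://localhost:5001/api/v1/generate
# With a running server, send it like this:
#   with urllib.request.urlopen(req) as resp:
#       print(json.load(resp)["results"][0]["text"])
```

KoboldCpp also serves an OpenAI-compatible chat completions endpoint under /v1/, so most OpenAI-style frontends can be pointed at the same port.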

For more information, be sure to run the program from command line with the --help flag. You can also refer to the readme and the wiki.

Source: README.md, updated 2025-12-04