Name | Modified | Size
---|---|---
koboldcpp-1.92 source code.tar.gz | 2025-05-24 | 31.5 MB
koboldcpp-1.92 source code.zip | 2025-05-24 | 31.9 MB
README.md | 2025-05-24 | 4.6 kB
koboldcpp_tools_24may2025.zip | 2025-05-24 | 19.8 MB
koboldcpp-linux-x64-cuda1210 | 2025-05-24 | 559.8 MB
koboldcpp-linux-x64-cuda1150 | 2025-05-24 | 480.4 MB
koboldcpp_oldcpu.exe | 2025-05-24 | 418.5 MB
koboldcpp_nocuda.exe | 2025-05-24 | 72.7 MB
koboldcpp_cu12.exe | 2025-05-24 | 569.1 MB
koboldcpp.exe | 2025-05-24 | 418.4 MB
koboldcpp-mac-arm64 | 2025-05-24 | 27.1 MB
koboldcpp-linux-x64-nocuda | 2025-05-24 | 78.8 MB
koboldcpp-1.92
early bug is for the birds edition
- Added support for SWA mode, which uses much less memory for the KV cache; use `--useswa` to enable (see the example command after this list).
- Note: SWA mode is not compatible with ContextShifting, and may result in degraded output when used with FastForwarding.
- Fixed an off-by-one error in some cases when Fast Forwarding that resulted in degraded output.
- Greatly improved tool calling by enforcing grammar on the output field names, and doing automatic tool selection as a separate pass. Tool calling should be much more reliable now (a request sketch follows this list).
- Added model size information to the Hugging Face search-and-download menu.
- CLI terminal output is now truncated in the middle of very long strings instead of at the end.
- Fixed unicode path handling for Image Generation models.
- Enabled threadpools; this should result in a speedup for Qwen3MoE.
- Merged Vision support for Llama4 models, simplified some vision preprocessing code.
- Fixes for prompt formatting for GLM4 models. GLM4 batch processing on Vulkan is fixed (thanks @0cc4m).
- Fixed incorrect AutoGuess adapter for some Mistral models. Also fixed some KoboldCppAutomatic placeholder tag replacements.
- The AI Horde advertised context now matches the main max context by default; this can be changed.
- Disable `--showgui` if `--skiplauncher` is used.
- StableUI now increments clip_skip and seed correctly when generating multiple images in a batch (thanks @wbruna).
- clip_skip is now stored inside image metadata, and the actual value of a random seed is also indicated.
- Added DDIM sampler for image generation.
- Added a simple optional Python requirements install script in `launch.cmd` for launching when run from unpacked directories.
- Updated Kobold Lite, with multiple fixes and improvements:
- Integrated dPaste.org (an open-source pastebin) as a platform for quickly sharing Save Files. You can also use a self-hosted instance by changing the endpoint URL. You can now share stories as a single URL via `Save/Load > Share > Export Share as Web URL`.
- Added an option to allow Horizontal Stacking of multiple images in one row.
- Fixed importing of Chub.AI character cards as they changed their endpoint.
- Added support for RisuAI V3 character cards (.charx archive format), also fixed KAISTORY handling.
- SSE streaming is now the default for all cases. It can be disabled in Advanced Settings.
- Changed markdown renderer to render markdown separately for each instruct turn.
- Better passthrough for KoboldCppAutomatic instruct preset, especially with split tags.
- Added an option to use TTS from the Pollinations API, which routes through OpenAI TTS models. Note that this TTS service applies server-side censorship via a content filter that I cannot control.
- Lite now sends stop sequences in OpenAI Chat Completions Endpoint mode (up to 4)
- Added ST-based randomizer macros like `{{roll:3d6}}` (thanks @hu-yijie).
- Added a new Immortal sampler preset by Jeb Carter.
- In polled streaming mode, you can fetch the last generated text if a request fails halfway.
- Added an exit button when editing raw text in corpo mode.
- Re-enabled a debug option for using raw placeholder tags on request. Not recommended.
- Added a debug option that allows changing the connected API at runtime.
- Merged fixes and improvements from upstream
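For reference, launching with the new SWA mode might look like the sketch below; the model filename is a placeholder, and the binary name depends on which build you downloaded:

```bash
# Enable SWA mode to reduce KV cache memory usage.
# "mymodel.gguf" is a placeholder; substitute your own model file.
./koboldcpp-linux-x64-nocuda --model mymodel.gguf --useswa
```

Keep in mind that this disables ContextShifting, as noted above.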
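And here is a minimal sketch of a tool-calling request against the OpenAI-compatible endpoint, assuming the server is running on the default port. The `get_weather` function and its schema are hypothetical, and the `model` field is a placeholder for a local server:

```bash
# Hypothetical tool definition following the standard OpenAI Chat Completions schema,
# which KoboldCpp emulates on its OpenAI-compatible endpoint.
curl -s http://localhost:5001/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "koboldcpp",
    "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
          "type": "object",
          "properties": {"city": {"type": "string"}},
          "required": ["city"]
        }
      }
    }]
  }'
```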
To use, download and run koboldcpp.exe, which is a one-file pyinstaller.
- If you don't need CUDA, you can use koboldcpp_nocuda.exe, which is much smaller.
- If you have an Nvidia GPU but use an old CPU and koboldcpp.exe does not work, try koboldcpp_oldcpu.exe.
- If you have a newer Nvidia GPU, you can use the CUDA 12 version koboldcpp_cu12.exe (much larger, slightly faster).
- If you're using Linux, select the appropriate Linux binary instead (not an exe).
- If you're on a modern macOS (M1, M2, M3, etc.), you can try the koboldcpp-mac-arm64 macOS binary.
- If you're using AMD, we recommend trying the Vulkan option (available in all releases) first, for best support.
Run it from the command line with the desired launch parameters (see `--help`), or manually select the model in the GUI. Once loaded, you can connect at http://localhost:5001 (or use the full KoboldAI client).

For more information, be sure to run the program from the command line with the `--help` flag. You can also refer to the readme and the wiki.
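As a quick connectivity check, you can also query the KoboldAI-compatible generation endpoint directly once the server is up; the payload below is a minimal example with arbitrary values:

```bash
# Minimal generation request against the default KoboldAI-compatible API.
curl -s http://localhost:5001/api/v1/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Hello, world.", "max_length": 50}'
```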