Name | Modified | Size
---|---|---
koboldcpp-nocuda.exe | 2025-06-25 | 75.6 MB
koboldcpp.exe | 2025-06-25 | 574.8 MB
koboldcpp-mac-arm64 | 2025-06-25 | 27.3 MB
koboldcpp-linux-x64-oldpc | 2025-06-25 | 468.4 MB
koboldcpp-linux-x64-nocuda | 2025-06-25 | 84.8 MB
koboldcpp-linux-x64 | 2025-06-25 | 568.6 MB
koboldcpp-oldpc.exe | 2025-06-25 | 403.1 MB
koboldcpp-1.94.1 source code.tar.gz | 2025-06-22 | 30.9 MB
koboldcpp-1.94.1 source code.zip | 2025-06-22 | 31.4 MB
README.md | 2025-06-22 | 5.6 kB
Totals: 10 items | | 2.3 GB
koboldcpp-1.94.1
are we comfy yet?
- NEW: Added unpacked mini-launcher: When unpacking KoboldCpp to a directory, a 5MB mini pyinstaller launcher is now also generated in that same directory, which allows you to easily start the unpacked KoboldCpp without needing to install Python or other dependencies. You can copy the unpacked directory and use it anywhere (thanks @henk717)
- NEW: Chroma Image Generation Support: Merged support for the Chroma model, a new architecture based on Flux Schnell (thanks @stduhpf)
  - This model also requires a T5-XXL text encoder and a Flux VAE to work, so be sure to load all three files (see the launch sketch after this list)!
  - Chroma requires descriptive prompts and negative prompts to work well; simple prompts will produce poor results.
- NEW: Added PhotoMaker Face Cloning: Use `--sdphotomaker` to load PhotoMaker along with any SDXL-based model. Then open the KoboldCpp SDUI and upload any reference image in the PhotoMaker input to clone the face! Works in all modes (inpaint/img2img/text2img).
- Swapping .gguf models in admin mode now allows overriding the config with a different one as well (both are customizable).
- Improved GBNF grammar performance by attempting a culled grammar search first (thanks @Reithan)
- Allow changing the main GPU with `--maingpu` when loading multi-GPU setups. The main GPU uses more VRAM and has a larger performance impact; by default it is the first GPU.
- Added configurable soft resolution limits and VAE tiling limits (thanks @wbruna), and fixed VAE tiling artifacts:
  - Added `--sdclampedsoft`, which provides "soft" total resolution clamping instead (e.g. 640 would allow 640x640, 512x768 and 768x512 images). It can be combined with `--sdclamped`, which provides hard clamping (no dimension can exceed it).
  - Added `--sdtiledvae`, which replaces `--sdnotile`: it allows specifying a size beyond which VAE tiling is applied.
- Use `--embeddingsmaxctx` to limit the max context length for embedding models (if you run out of memory, this will help).
- Added `--embeddingsgpu` to allow offloading embedding model layers to the GPU. This is NOT recommended, as it doesn't provide much speedup: embedding models already use the GPU for processing even without dedicated offload.
- Display available RAM on startup, and display the version number in the terminal window title.
- ComfyUI emulation now covers the `/upload/image` endpoint, which allows Img2Img ComfyUI workflows (see the curl sketch at the end of these notes). Files are stored temporarily in memory only.
- Added more performance stats for token speeds and timings.
- Updated Kobold Lite, multiple fixes and improvements:
  - Fixed the Chub.ai importer again
  - Added a card importer for char-archive.evulid.cc
  - Added an option to import an image from the webcam
  - Allow markdown when streaming the current turn
  - Improved the CSS import sanitizer (thanks @PeterPeet)
  - Word Frequency Search (inspired by @trincadev's MyGhostWriter)
  - Allow usermods and CSS to be loaded from a file
  - Added WebSearch for corpo mode
  - Added Img2Img support for ComfyUI backends
  - Added the ability to use a custom OpenAI endpoint for the TextDB embedding model
  - Minor linting, plus a splitter/merge tool by @ehoogeveen-medweb
  - Fixed lookahead scanning for the Author's Note insertion point
- Merged new model support, fixes and improvements from upstream
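For a sense of how the new image-generation and GPU flags above fit together, here is a rough launch sketch. Only the flag names are taken from these notes; the file names and values are placeholders, and the exact argument forms (for example, whether `--sdphotomaker` takes a file path, and whether the Chroma companion files are loaded via `--sdt5xxl`/`--sdvae`) are assumptions, so verify against `--help`:

```bash
# Sketch 1: Chroma image generation. Chroma needs the model, a T5-XXL
# encoder and a Flux VAE all loaded; --sdt5xxl and --sdvae are assumed
# to be the loaders for the latter two (check --help).
./koboldcpp-linux-x64 --model my-text-model.gguf \
    --sdmodel chroma.safetensors \
    --sdt5xxl t5xxl.safetensors \
    --sdvae flux-ae.safetensors

# Sketch 2: SDXL with PhotoMaker face cloning, plus the new tuning flags.
# --maingpu 1 makes the second GPU the "main" one (the default is the first);
# --sdclampedsoft 640 soft-clamps total resolution (640x640, 512x768, 768x512);
# --sdtiledvae 768 applies VAE tiling to images beyond 768 pixels;
# --embeddingsmaxctx 2048 caps the embedding model's context to save memory.
./koboldcpp-linux-x64 --model my-text-model.gguf \
    --usecublas --maingpu 1 \
    --sdmodel sdxl-base.safetensors \
    --sdphotomaker photomaker-v1.safetensors \
    --sdclampedsoft 640 --sdtiledvae 768 \
    --embeddingsmaxctx 2048
```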
Hotfix 1.94.1: Minor bugfixes; fixed Ollama-compatible vision, added AVX/AVX2 detection for backend auto-selection, and cleaned up the oldpc builds to only include oldpc files.
Download and run the koboldcpp.exe (Windows) or koboldcpp-linux-x64 (Linux), which is a one-file pyinstaller for NVIDIA GPU users.
If you have an older CPU or older NVIDIA GPU and koboldcpp does not work, try the oldpc version instead (CUDA 11 + AVX1).
If you don't have an NVIDIA GPU, or do not need CUDA, you can use the nocuda version which is smaller.
If you're using AMD, we recommend trying the Vulkan option in the nocuda build first, for best support. Alternatively, you can try YellowRoseCx's koboldcpp_rocm fork if you are a Windows user, or download our rolling ROCm binary if you use Linux.
If you're on a modern macOS (M-series) machine, you can use the koboldcpp-mac-arm64 binary.
Click here for .gguf conversion and quantization tools
Deprecation Reminder: Binary filenames have been renamed. The files named `koboldcpp_cu12.exe`, `koboldcpp_oldcpu.exe`, `koboldcpp_nocuda.exe`, `koboldcpp-linux-x64-cuda1210`, and `koboldcpp-linux-x64-cuda1150` have been removed. Please switch to the new filenames.
Run it from the command line with the desired launch parameters (see `--help`), or manually select the model in the GUI. Once the model is loaded, you can connect at http://localhost:5001 (or use the full KoboldAI client).
For more information, be sure to run the program from the command line with the `--help` flag. You can also refer to the readme and the wiki.
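As a quick sanity check once a model is loaded, something like the following should work. The `/api/v1/generate` call uses KoboldCpp's standard KoboldAI-compatible API; the `/upload/image` call exercises the new ComfyUI emulation endpoint, and the multipart `image` field name is an assumption carried over from ComfyUI's own upload API:

```bash
# Minimal text generation request against a running KoboldCpp instance.
curl http://localhost:5001/api/v1/generate \
    -H "Content-Type: application/json" \
    -d '{"prompt": "Once upon a time,", "max_length": 32}'

# Upload a reference image via the ComfyUI-emulated endpoint (files are
# held temporarily in memory only, per the notes above). The "image"
# form field name is assumed to match ComfyUI's upload API.
curl http://localhost:5001/upload/image -F "image=@reference.png"
```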