Home / b8064
| Name | Modified | Size |
| --- | --- | --- |
| llama-b8064-xcframework.zip | < 19 hours ago | 179.5 MB |
| llama-b8064-bin-win-vulkan-x64.zip | < 19 hours ago | 47.7 MB |
| llama-b8064-bin-win-sycl-x64.zip | < 19 hours ago | 120.6 MB |
| llama-b8064-bin-win-opencl-adreno-arm64.zip | < 19 hours ago | 25.3 MB |
| llama-b8064-bin-win-hip-radeon-x64.zip | < 19 hours ago | 369.3 MB |
| llama-b8064-bin-win-cuda-13.1-x64.zip | < 19 hours ago | 148.5 MB |
| llama-b8064-bin-win-cuda-12.4-x64.zip | < 19 hours ago | 220.0 MB |
| llama-b8064-bin-win-cpu-x64.zip | < 19 hours ago | 31.0 MB |
| llama-b8064-bin-win-cpu-arm64.zip | < 19 hours ago | 24.4 MB |
| llama-b8064-bin-ubuntu-x64.tar.gz | < 19 hours ago | 24.7 MB |
| llama-b8064-bin-ubuntu-vulkan-x64.tar.gz | < 19 hours ago | 41.5 MB |
| llama-b8064-bin-ubuntu-s390x.tar.gz | < 19 hours ago | 25.6 MB |
| llama-b8064-bin-macos-x64.tar.gz | < 19 hours ago | 86.1 MB |
| llama-b8064-bin-macos-arm64.tar.gz | < 19 hours ago | 30.4 MB |
| llama-b8064-bin-910b-openEuler-x86-aclgraph.tar.gz | < 19 hours ago | 61.6 MB |
| llama-b8064-bin-910b-openEuler-aarch64-aclgraph.tar.gz | < 19 hours ago | 55.6 MB |
| llama-b8064-bin-310p-openEuler-x86.tar.gz | < 19 hours ago | 61.6 MB |
| llama-b8064-bin-310p-openEuler-aarch64.tar.gz | < 19 hours ago | 55.6 MB |
| cudart-llama-bin-win-cuda-13.1-x64.zip | < 19 hours ago | 402.6 MB |
| cudart-llama-bin-win-cuda-12.4-x64.zip | < 19 hours ago | 391.4 MB |
| b8064 source code.tar.gz | < 21 hours ago | 29.0 MB |
| b8064 source code.zip | < 21 hours ago | 30.1 MB |
| README.md | < 21 hours ago | 3.1 kB |

Totals: 23 items, 2.5 GB, 0 downloads / week
cuda: optimize iq2xxs/iq2xs/iq3xxs dequantization (#19624)

* cuda: optimize iq2xxs/iq2xs/iq3xxs dequantization
  - load all 8 int8 for a grid position in one load
  - calculate signs via popcnt instead of fetching from ksigns table
  - broadcast signs to drop individual shift/mask
* cuda: iq2xxs: simplify sum scaling
  - express `(sum * scale + sum / 2) / 4` as `(sum * (scale * 2 + 1)) / 8`
  - express `((aux32 >> 28) * 2 + 1)` as `(aux32 >> 27 | 1)`
  - saves 3 registers for mul_mat_vec_q (152 -> 149) according to nsight
  - AFAICT no overflow can occur here as iq2xxs values are far too small
* uint -> uint32_t
  - error: identifier "uint" is undefined
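
The two rewrites in the "simplify sum scaling" commit can be checked on the host with a brute-force comparison. The following is a minimal sketch, not code from the release: the function names are hypothetical, and it assumes `sum` is a signed `int`, `scale` is the 4-bit value taken from `aux32 >> 28`, and division is C++ truncating integer division. The tested range for `sum` is also an assumption, chosen small enough that no overflow occurs.

```cpp
#include <cstdint>

// Hypothetical helper: original form, (sum * scale + sum / 2) / 4.
constexpr int scale_sum_old(int sum, int scale) {
    return (sum * scale + sum / 2) / 4;
}

// Hypothetical helper: rewritten form, (sum * (scale * 2 + 1)) / 8.
constexpr int scale_sum_new(int sum, int scale) {
    return (sum * (scale * 2 + 1)) / 8;
}

// Hypothetical helper: original odd multiplier, ((aux32 >> 28) * 2 + 1).
constexpr std::uint32_t odd_scale_old(std::uint32_t aux32) {
    return (aux32 >> 28) * 2 + 1;
}

// Hypothetical helper: rewritten odd multiplier, (aux32 >> 27 | 1).
// The shift binds tighter than the OR, so this is ((aux32 >> 27) | 1).
constexpr std::uint32_t odd_scale_new(std::uint32_t aux32) {
    return aux32 >> 27 | 1;
}

// Compares old and new forms for every 4-bit scale and every sum in
// [-max_abs_sum, max_abs_sum], then for every pattern of the top five
// bits of aux32 combined with a few choices of the lower 27 bits.
bool identities_hold(int max_abs_sum = 1 << 16) {
    for (int scale = 0; scale < 16; ++scale) {
        for (int sum = -max_abs_sum; sum <= max_abs_sum; ++sum) {
            if (scale_sum_old(sum, scale) != scale_sum_new(sum, scale)) {
                return false;
            }
        }
    }
    const std::uint32_t lows[] = {0u, 1u, 0x07FFFFFFu};
    for (std::uint32_t top = 0; top < 32; ++top) {
        for (std::uint32_t low : lows) {
            const std::uint32_t aux32 = (top << 27) | low;
            if (odd_scale_old(aux32) != odd_scale_new(aux32)) {
                return false;
            }
        }
    }
    return true;
}
```

Calling `identities_hold()` from a test and checking that it returns `true` confirms the equivalence over the sampled inputs. The second rewrite works because `(aux32 >> 28) * 2` is `aux32 >> 27` with its lowest bit cleared, so adding 1 and OR-ing with 1 give the same result.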

macOS/iOS:

Linux:

Windows:

openEuler:

Source: README.md, updated 2026-02-15