Home / b8089
Name Modified Size Downloads / Week
llama-b8089-xcframework.zip < 17 hours ago 167.3 MB
llama-b8089-bin-win-vulkan-x64.zip < 17 hours ago 47.7 MB
llama-b8089-bin-win-sycl-x64.zip < 17 hours ago 120.6 MB
llama-b8089-bin-win-opencl-adreno-arm64.zip < 17 hours ago 25.3 MB
llama-b8089-bin-win-hip-radeon-x64.zip < 17 hours ago 369.3 MB
llama-b8089-bin-win-cuda-13.1-x64.zip < 17 hours ago 148.5 MB
llama-b8089-bin-win-cuda-12.4-x64.zip < 17 hours ago 220.0 MB
llama-b8089-bin-win-cpu-x64.zip < 17 hours ago 31.0 MB
llama-b8089-bin-win-cpu-arm64.zip < 17 hours ago 24.4 MB
llama-b8089-bin-ubuntu-x64.tar.gz < 17 hours ago 24.6 MB
llama-b8089-bin-ubuntu-vulkan-x64.tar.gz < 17 hours ago 41.5 MB
llama-b8089-bin-ubuntu-s390x.tar.gz < 17 hours ago 25.5 MB
llama-b8089-bin-macos-x64.tar.gz < 17 hours ago 86.1 MB
llama-b8089-bin-macos-arm64.tar.gz < 17 hours ago 30.4 MB
llama-b8089-bin-910b-openEuler-x86-aclgraph.tar.gz < 17 hours ago 61.5 MB
llama-b8089-bin-910b-openEuler-aarch64-aclgraph.tar.gz < 17 hours ago 55.5 MB
llama-b8089-bin-310p-openEuler-x86.tar.gz < 17 hours ago 61.5 MB
llama-b8089-bin-310p-openEuler-aarch64.tar.gz < 17 hours ago 55.6 MB
cudart-llama-bin-win-cuda-13.1-x64.zip < 17 hours ago 402.6 MB
cudart-llama-bin-win-cuda-12.4-x64.zip < 17 hours ago 391.4 MB
b8089 source code.tar.gz < 18 hours ago 29.0 MB
b8089 source code.zip < 18 hours ago 30.1 MB
README.md < 18 hours ago 2.9 kB
Totals: 23 Items   2.4 GB 0
vulkan: split mul_mat into multiple dispatches to avoid overflow (#19509)

* vulkan: split mul_mat into multiple dispatches to avoid overflow

  The batch dimensions can be greater than the max workgroup count limit, in which case we need to split into multiple dispatches and pass the base index through a push constant.

  Fall back for the less common p021 and nc variants.

* address feedback
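The splitting described in this change can be illustrated with a small host-side sketch. This is not the code from #19509. It only shows the chunking arithmetic, assuming a hypothetical helper `split_dispatches` and an illustrative per-dimension limit of 65535 workgroups. Each yielded pair stands for one dispatch, and the base index is the value a shader would receive (for example through a push constant) to recover the global batch index.

```python
from typing import Iterator, Tuple


def split_dispatches(batch_count: int, max_workgroups: int) -> Iterator[Tuple[int, int]]:
    """Yield (base_index, count) pairs that together cover range(batch_count).

    Hypothetical helper for illustration only. Each count is at most
    max_workgroups, so every chunk fits in a single dispatch.
    """
    if batch_count < 0 or max_workgroups <= 0:
        raise ValueError("batch_count must be >= 0 and max_workgroups must be > 0")
    base = 0
    while base < batch_count:
        # The final chunk may be smaller than the limit.
        count = min(max_workgroups, batch_count - base)
        yield base, count
        base += count


if __name__ == "__main__":
    # Illustrative numbers: 150000 batches against a limit of 65535.
    for base, count in split_dispatches(150000, 65535):
        # Inside the shader, the global batch index would be
        # base + (workgroup index along the batch dimension).
        print(f"dispatch: base={base} count={count}")
```

With these numbers the loop produces three dispatches, the last one covering the remaining 18930 batches. When the batch count is already within the limit, the helper yields a single chunk with base index 0, which matches the unsplit case.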

macOS/iOS:

Linux:

Windows:

openEuler:

Source: README.md, updated 2026-02-18