| Name | Modified | Size |
| --- | --- | --- |
| llama-b7815-xcframework.zip | < 9 hours ago | 173.9 MB |
| llama-b7815-bin-win-vulkan-x64.zip | < 9 hours ago | 46.5 MB |
| llama-b7815-bin-win-sycl-x64.zip | < 9 hours ago | 119.6 MB |
| llama-b7815-bin-win-opencl-adreno-arm64.zip | < 9 hours ago | 24.5 MB |
| llama-b7815-bin-win-hip-radeon-x64.zip | < 9 hours ago | 361.1 MB |
| llama-b7815-bin-win-cuda-13.1-x64.zip | < 9 hours ago | 144.6 MB |
| llama-b7815-bin-win-cuda-12.4-x64.zip | < 9 hours ago | 219.4 MB |
| llama-b7815-bin-win-cpu-x64.zip | < 9 hours ago | 30.0 MB |
| llama-b7815-bin-win-cpu-arm64.zip | < 9 hours ago | 23.7 MB |
| llama-b7815-bin-ubuntu-x64.tar.gz | < 9 hours ago | 23.7 MB |
| llama-b7815-bin-ubuntu-vulkan-x64.tar.gz | < 9 hours ago | 40.5 MB |
| llama-b7815-bin-ubuntu-s390x.tar.gz | < 9 hours ago | 24.7 MB |
| llama-b7815-bin-macos-x64.tar.gz | < 9 hours ago | 83.1 MB |
| llama-b7815-bin-macos-arm64.tar.gz | < 9 hours ago | 29.4 MB |
| llama-b7815-bin-910b-openEuler-x86-aclgraph.tar.gz | < 9 hours ago | 59.4 MB |
| llama-b7815-bin-910b-openEuler-aarch64-aclgraph.tar.gz | < 9 hours ago | 53.7 MB |
| llama-b7815-bin-310p-openEuler-x86.tar.gz | < 9 hours ago | 59.4 MB |
| llama-b7815-bin-310p-openEuler-aarch64.tar.gz | < 9 hours ago | 53.7 MB |
| cudart-llama-bin-win-cuda-13.1-x64.zip | < 9 hours ago | 402.6 MB |
| cudart-llama-bin-win-cuda-12.4-x64.zip | < 9 hours ago | 391.4 MB |
| b7815 source code.tar.gz | < 9 hours ago | 28.8 MB |
| b7815 source code.zip | < 9 hours ago | 29.8 MB |
| README.md | < 9 hours ago | 3.6 kB |

Totals: 23 items, 2.4 GB, 0 downloads / week
ggml-cpu: aarm64: q5_K repack gemm and gemv (and generic) implementations (i8mm) (#18860)

* Boilerplate for q5_Kx8 REPACK on ARM and fallback
* Implements make_block_q5_Kx8 by extending make_block_q4_Kx8
* q5_K repack gemm and gemv generics
* Gemm and Gemv ARM implementations (i8mm)
* Improved qh manipulation looking at non-repack vec_dot implementation
* Full unroll
* Apply Q5_K Gemv vand and vshl optimizations to gemm. Improve comments.
* Fix wrong fallback definitions of Q5_K
* Fixed comments. Reverted unnecessary formatting
* Fixed typo in generic definitions
* Switching AND + Shift with Shift Insert. Better op interleaving.
* Vectorize + unroll the block scales
* Apply gemm optimizations to gemv
* Improve bias calculation

Signed-off-by: Alberto Cabrera <alberto.cabrera@liquid.ai>
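The commit notes mention replacing an AND + shift sequence with a shift-insert. As a rough illustration only, the scalar Python model below shows why a shift-left-and-insert lane operation can stand in for masking and shifting when a 4-bit low nibble is merged with one high bit to form a 5-bit quant. It is not taken from the patch: the helper names (`sli_u8`, `merge_and_shift`, `merge_shift_insert`) are hypothetical, and the actual kernel's data layout and instruction sequence may differ.

```python
def sli_u8(dest: int, src: int, n: int) -> int:
    """Scalar model of one 8-bit shift-left-and-insert lane:
    src is shifted left by n, and the low n bits of dest are preserved."""
    keep_mask = (1 << n) - 1
    return ((src << n) & 0xFF) | (dest & keep_mask)


def merge_and_shift(packed_lo: int, high_bit: int) -> int:
    """Classic form: mask the low nibble, shift the high bit to bit 4, OR them."""
    return (packed_lo & 0x0F) | ((high_bit << 4) & 0xFF)


def merge_shift_insert(packed_lo: int, high_bit: int) -> int:
    """Shift-insert form: a single operation keeps the low nibble of
    packed_lo and places high_bit at bit 4 (hypothetical usage)."""
    return sli_u8(packed_lo, high_bit, 4)


def check_equivalence() -> bool:
    """Exhaustively compare both forms for every byte and both bit values."""
    return all(
        merge_and_shift(lo, hb) == merge_shift_insert(lo, hb)
        for lo in range(256)
        for hb in (0, 1)
    )


if __name__ == "__main__":
    print(check_equivalence())
    print(hex(merge_shift_insert(0xAB, 1)))
```

In this model the shift-insert form does not need a separate mask on `packed_lo`, because the insert itself discards the destination's upper bits.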

macOS/iOS:
- macOS Apple Silicon (arm64)
- macOS Intel (x64)
- iOS XCFramework

Linux:
- Ubuntu x64 (CPU)
- Ubuntu x64 (Vulkan)
- Ubuntu s390x (CPU)

Windows:
- Windows x64 (CPU)
- Windows arm64 (CPU)
- Windows x64 (CUDA 12) - CUDA 12.4 DLLs
- Windows x64 (CUDA 13) - CUDA 13.1 DLLs
- Windows x64 (Vulkan)
- Windows x64 (SYCL)
- Windows x64 (HIP)

openEuler:
- openEuler x86 (310p)
- openEuler x86 (910b, ACL Graph)
- openEuler aarch64 (310p)
- openEuler aarch64 (910b, ACL Graph)
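To pick the right archive for a platform, the README labels above can be matched to the file names in the listing. The sketch below does that in Python. The pairing of labels to files is inferred from the names and is an assumption, as are the identifiers `BUILD_TAG`, `ARCHIVE_SUFFIXES`, and `archive_name`. The CUDA runtime DLL archives (`cudart-llama-bin-win-cuda-*.zip`) carry no build tag and are left out, as is `llama-b7815-bin-win-opencl-adreno-arm64.zip`, which has no matching README label.

```python
BUILD_TAG = "b7815"  # release tag as it appears in the file names

# README label -> archive suffix (mapping inferred from the file listing)
ARCHIVE_SUFFIXES = {
    "macOS Apple Silicon (arm64)": "bin-macos-arm64.tar.gz",
    "macOS Intel (x64)": "bin-macos-x64.tar.gz",
    "iOS XCFramework": "xcframework.zip",
    "Ubuntu x64 (CPU)": "bin-ubuntu-x64.tar.gz",
    "Ubuntu x64 (Vulkan)": "bin-ubuntu-vulkan-x64.tar.gz",
    "Ubuntu s390x (CPU)": "bin-ubuntu-s390x.tar.gz",
    "Windows x64 (CPU)": "bin-win-cpu-x64.zip",
    "Windows arm64 (CPU)": "bin-win-cpu-arm64.zip",
    "Windows x64 (CUDA 12)": "bin-win-cuda-12.4-x64.zip",
    "Windows x64 (CUDA 13)": "bin-win-cuda-13.1-x64.zip",
    "Windows x64 (Vulkan)": "bin-win-vulkan-x64.zip",
    "Windows x64 (SYCL)": "bin-win-sycl-x64.zip",
    "Windows x64 (HIP)": "bin-win-hip-radeon-x64.zip",
    "openEuler x86 (310p)": "bin-310p-openEuler-x86.tar.gz",
    "openEuler x86 (910b, ACL Graph)": "bin-910b-openEuler-x86-aclgraph.tar.gz",
    "openEuler aarch64 (310p)": "bin-310p-openEuler-aarch64.tar.gz",
    "openEuler aarch64 (910b, ACL Graph)": "bin-910b-openEuler-aarch64-aclgraph.tar.gz",
}


def archive_name(label: str, tag: str = BUILD_TAG) -> str:
    """Return the archive file name for a README platform label.

    Raises KeyError if the label is not one of the listed platforms.
    """
    return f"llama-{tag}-{ARCHIVE_SUFFIXES[label]}"


if __name__ == "__main__":
    print(archive_name("Ubuntu x64 (Vulkan)"))
```

Keeping the tag separate from the suffix means the same table could be reused for another build, assuming the naming scheme stays the same between releases.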

Source: README.md, updated 2026-01-23