llama.cpp - Browse /b7845 at SourceForge.net

Name	Modified	Size	Downloads / Week
llama-b7845-xcframework.zip	< 10 hours ago	174.4 MB	0
llama-b7845-bin-win-vulkan-x64.zip	< 10 hours ago	46.8 MB	0
llama-b7845-bin-win-sycl-x64.zip	< 10 hours ago	119.8 MB	0
llama-b7845-bin-win-opencl-adreno-arm64.zip	< 10 hours ago	24.6 MB	0
llama-b7845-bin-win-hip-radeon-x64.zip	< 10 hours ago	361.7 MB	0
llama-b7845-bin-win-cuda-13.1-x64.zip	< 10 hours ago	145.4 MB	0
llama-b7845-bin-win-cuda-12.4-x64.zip	< 10 hours ago	220.6 MB	0
llama-b7845-bin-win-cpu-x64.zip	< 10 hours ago	30.2 MB	0
llama-b7845-bin-win-cpu-arm64.zip	< 10 hours ago	23.8 MB	0
llama-b7845-bin-ubuntu-x64.tar.gz	< 10 hours ago	23.8 MB	0
llama-b7845-bin-ubuntu-vulkan-x64.tar.gz	< 10 hours ago	40.7 MB	0
llama-b7845-bin-ubuntu-s390x.tar.gz	< 10 hours ago	24.8 MB	0
llama-b7845-bin-macos-x64.tar.gz	< 10 hours ago	83.2 MB	0
llama-b7845-bin-macos-arm64.tar.gz	< 10 hours ago	29.5 MB	0
llama-b7845-bin-910b-openEuler-x86-aclgraph.tar.gz	< 10 hours ago	59.5 MB	0
llama-b7845-bin-910b-openEuler-aarch64-aclgraph.tar.gz	< 10 hours ago	53.8 MB	0
llama-b7845-bin-310p-openEuler-x86.tar.gz	< 10 hours ago	59.5 MB	0
llama-b7845-bin-310p-openEuler-aarch64.tar.gz	< 10 hours ago	53.8 MB	0
cudart-llama-bin-win-cuda-13.1-x64.zip	< 10 hours ago	402.6 MB	0
cudart-llama-bin-win-cuda-12.4-x64.zip	< 10 hours ago	391.4 MB	0
b7845 source code.tar.gz	< 12 hours ago	28.8 MB	0
b7845 source code.zip	< 12 hours ago	29.8 MB	0
README.md	< 12 hours ago	3.5 kB	0
Totals: 23 Items		2.4 GB	0

ggml-cpu: aarm64: q6_K repack gemm and gemv (and generic) implementations (i8mm) [#18860] (#18888)

* Boilerplate for q6_K repack
* q6_K repack to q6_Kx8 implementation

Signed-off-by: Alberto Cabrera <alberto.cabrera@liquid.ai>

* q6_K generic gemv and gemm
* wip, gemm_q6_K 8x8
* Still WIP: loading of q8s, q6h and q6l
* first working version of q6_K gemm
* Moved q6 loads outside of sb block, Unrolled inner loop
* Replaced modulo with mask
* First implementation of GEMV
* ggml_vdotq_s32 -> vdotq_s32
* Reduce width of accumulators in q6_K gemv
* Bsums instead of calc bias. Preload scales to use vget_lane. Unroll.
* Reuse scales in GEMM (same GEMV opt)
* Added todos for bsum and different qh repack
* Arch fallback
* VSLIQ for merging qh and ql
* Removed TODO, already tested
* Apply suggestions

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Removed unused import

---------

Signed-off-by: Alberto Cabrera <alberto.cabrera@liquid.ai>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
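Two items in the commit message name small bit-level tricks: merging the high bits (qh) and low bits (ql) of a q6_K value with a shift-left-and-insert, and replacing a modulo with a mask. The scalar Python sketch below illustrates the arithmetic only. It is not the NEON kernel from the commit: the function names are hypothetical, and the packed memory layout of the repacked q6_Kx8 blocks is not shown in this page and is not modelled here. It assumes the usual q6_K convention that a 6-bit value is built from 4 low bits and 2 high bits and is stored with an offset of 32.

```python
def merge_q6(ql_nibble: int, qh_bits: int) -> int:
    """Combine 4 low bits and 2 high bits into one unsigned 6-bit value.

    Mirrors what a shift-left-and-insert by 4 does per element: the low
    4 bits come from ql, and qh is placed directly above them.
    """
    return ((qh_bits & 0x03) << 4) | (ql_nibble & 0x0F)


def dequant_q6(ql_nibble: int, qh_bits: int, scale: int, d: float) -> float:
    """Recentre the 6-bit value to [-32, 31], then apply the scales.

    `scale` stands for the per-sub-block integer scale and `d` for the
    per-block float scale (assumed names for illustration).
    """
    q = merge_q6(ql_nibble, qh_bits) - 32
    return d * scale * q


def lane(i: int, width: int = 8) -> int:
    """Index within a group of `width` elements, without a modulo.

    Valid only when `width` is a power of two and `i` is non-negative;
    under those conditions i % width == i & (width - 1).
    """
    return i & (width - 1)


if __name__ == "__main__":
    print(merge_q6(0x5, 0x2))          # low nibble 0101, high bits 10
    print(dequant_q6(0xF, 0x3, 1, 0.5))
    print([lane(i) for i in range(10)])
```

The mask form matters in inner loops because it avoids an integer division, and the insert form replaces a separate shift and OR with one operation per vector on hardware that supports it.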

macOS/iOS:
- macOS Apple Silicon (arm64)
- macOS Intel (x64)
- iOS XCFramework

Linux:
- Ubuntu x64 (CPU)
- Ubuntu x64 (Vulkan)
- Ubuntu s390x (CPU)

Windows:
- Windows x64 (CPU)
- Windows arm64 (CPU)
- Windows x64 (CUDA 12) - CUDA 12.4 DLLs
- Windows x64 (CUDA 13) - CUDA 13.1 DLLs
- Windows x64 (Vulkan)
- Windows x64 (SYCL)
- Windows x64 (HIP)

openEuler:
- openEuler x86 (310p)
- openEuler x86 (910b, ACL Graph)
- openEuler aarch64 (310p)
- openEuler aarch64 (910b, ACL Graph)
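As a rough aid for matching a platform entry to its archive, the lookup below pairs some of the platform labels with file names copied from the listing for this build. It is a minimal sketch, not an official tool: the `ASSETS` table, the key spelling, and the `asset_for` helper are all hypothetical, the macOS entries use the placeholder backend `"default"` because the list names no backend for them, and the openEuler and OpenCL archives are left out for brevity.

```python
# (os, arch, backend) -> archive name, copied from the b7845 file listing.
ASSETS = {
    ("macos", "arm64", "default"): "llama-b7845-bin-macos-arm64.tar.gz",
    ("macos", "x64", "default"): "llama-b7845-bin-macos-x64.tar.gz",
    ("ubuntu", "x64", "cpu"): "llama-b7845-bin-ubuntu-x64.tar.gz",
    ("ubuntu", "x64", "vulkan"): "llama-b7845-bin-ubuntu-vulkan-x64.tar.gz",
    ("ubuntu", "s390x", "cpu"): "llama-b7845-bin-ubuntu-s390x.tar.gz",
    ("windows", "x64", "cpu"): "llama-b7845-bin-win-cpu-x64.zip",
    ("windows", "arm64", "cpu"): "llama-b7845-bin-win-cpu-arm64.zip",
    ("windows", "x64", "cuda-12.4"): "llama-b7845-bin-win-cuda-12.4-x64.zip",
    ("windows", "x64", "cuda-13.1"): "llama-b7845-bin-win-cuda-13.1-x64.zip",
    ("windows", "x64", "vulkan"): "llama-b7845-bin-win-vulkan-x64.zip",
    ("windows", "x64", "sycl"): "llama-b7845-bin-win-sycl-x64.zip",
    ("windows", "x64", "hip"): "llama-b7845-bin-win-hip-radeon-x64.zip",
}


def asset_for(os_name: str, arch: str, backend: str = "default") -> str:
    """Return the archive name for a platform, or raise LookupError."""
    key = (os_name.lower(), arch.lower(), backend.lower())
    try:
        return ASSETS[key]
    except KeyError:
        raise LookupError(f"no prebuilt archive listed for {key}") from None


if __name__ == "__main__":
    print(asset_for("Windows", "x64", "vulkan"))
    print(asset_for("macOS", "arm64"))
```

The Windows CUDA builds have separate `cudart-llama-bin-win-cuda-*-x64.zip` archives in the listing, which correspond to the "CUDA 12.4 DLLs" and "CUDA 13.1 DLLs" entries.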

Source: README.md, updated 2026-01-27