llama.cpp - Browse /b7898 at SourceForge.net

The interactive file manager requires Javascript. Please enable it or use sftp or scp.
You may still browse the files here.

Name	Modified	Size	InfoDownloads / Week
Parent folder
llama-b7898-xcframework.zip	< 5 hours ago	174.6 MB	0
llama-b7898-bin-win-vulkan-x64.zip	< 5 hours ago	47.2 MB	0
llama-b7898-bin-win-sycl-x64.zip	< 5 hours ago	120.2 MB	0
llama-b7898-bin-win-opencl-adreno-arm64.zip	< 5 hours ago	25.0 MB	0
llama-b7898-bin-win-hip-radeon-x64.zip	< 5 hours ago	365.2 MB	0
llama-b7898-bin-win-cuda-13.1-x64.zip	< 5 hours ago	146.7 MB	0
llama-b7898-bin-win-cuda-12.4-x64.zip	< 5 hours ago	216.6 MB	0
llama-b7898-bin-win-cpu-x64.zip	< 5 hours ago	30.6 MB	0
llama-b7898-bin-win-cpu-arm64.zip	< 5 hours ago	24.2 MB	0
llama-b7898-bin-ubuntu-x64.tar.gz	< 5 hours ago	24.4 MB	0
llama-b7898-bin-ubuntu-vulkan-x64.tar.gz	< 5 hours ago	41.2 MB	0
llama-b7898-bin-ubuntu-s390x.tar.gz	< 5 hours ago	25.3 MB	0
llama-b7898-bin-macos-x64.tar.gz	< 5 hours ago	84.8 MB	0
llama-b7898-bin-macos-arm64.tar.gz	< 5 hours ago	30.0 MB	0
llama-b7898-bin-910b-openEuler-x86-aclgraph.tar.gz	< 5 hours ago	61.4 MB	0
llama-b7898-bin-910b-openEuler-aarch64-aclgraph.tar.gz	< 5 hours ago	55.4 MB	0
llama-b7898-bin-310p-openEuler-x86.tar.gz	< 5 hours ago	61.4 MB	0
llama-b7898-bin-310p-openEuler-aarch64.tar.gz	< 5 hours ago	55.4 MB	0
cudart-llama-bin-win-cuda-13.1-x64.zip	< 5 hours ago	402.6 MB	0
cudart-llama-bin-win-cuda-12.4-x64.zip	< 5 hours ago	391.4 MB	0
b7898 source code.tar.gz	< 7 hours ago	28.9 MB	0
b7898 source code.zip	< 7 hours ago	29.9 MB	0
README.md	< 7 hours ago	3.8 kB	0
Totals: 23 Items		2.4 GB	0

ggml-hexagon: flash-attention and reduce-sum optimizations (#19141) * wip * ggml-hexagon: add vectorized dot product function for FP32 and FP16 accumulation * ggml-hexagon: optimize dot product functions for FP16 and FP32 with new vectorized implementations * wip * ggml-hexagon: optimize hvx_vec_dump_f32_n and hvx_vec_reduce_sum_qf32x2 functions for improved performance * ggml-hexagon: refactor dot product functions to use a common loading function for improved readability * optimize vector dot product functions to use unified reduction for improved performance * wip * ggml-hexagon: add vectorized dot product function for FP32 and FP16 accumulation * ggml-hexagon: optimize dot product functions for FP16 and FP32 with new vectorized implementations * wip * ggml-hexagon: optimize hvx_vec_dump_f32_n and hvx_vec_reduce_sum_qf32x2 functions for improved performance * ggml-hexagon: refactor dot product functions to use a common loading function for improved readability * optimize vector dot product functions to use unified reduction for improved performance * hexagon: optimize reduce-sum for v75+ * hexagon: always keep row_sums in sf/fp32 * ggml-hexagon: enhance directory checks for HEXAGON_SDK_ROOT and HEXAGON_TOOLS_ROOT * fix compiling error after rebase --------- Co-authored-by: Max Krasnyansky <maxk@qti.qualcomm.com>

macOS/iOS: - macOS Apple Silicon (arm64) - macOS Intel (x64) - iOS XCFramework

Linux: - Ubuntu x64 (CPU) - Ubuntu x64 (Vulkan) - Ubuntu s390x (CPU)

Windows: - Windows x64 (CPU) - Windows arm64 (CPU) - Windows x64 (CUDA 12) - CUDA 12.4 DLLs - Windows x64 (CUDA 13) - CUDA 13.1 DLLs - Windows x64 (Vulkan) - Windows x64 (SYCL) - Windows x64 (HIP)

openEuler: - openEuler x86 (310p) - openEuler x86 (910b, ACL Graph) - openEuler aarch64 (310p) - openEuler aarch64 (910b, ACL Graph)

Source: README.md, updated 2026-01-31