llama.cpp - Browse /b7833 at SourceForge.net

Name	Modified	Size	Downloads / Week
llama-b7833-xcframework.zip	2026-01-25	174.3 MB	0
llama-b7833-bin-win-vulkan-x64.zip	2026-01-25	46.7 MB	0
llama-b7833-bin-win-sycl-x64.zip	2026-01-25	119.7 MB	0
llama-b7833-bin-win-opencl-adreno-arm64.zip	2026-01-25	24.6 MB	0
llama-b7833-bin-win-hip-radeon-x64.zip	2026-01-25	361.2 MB	0
llama-b7833-bin-win-cuda-13.1-x64.zip	2026-01-25	144.6 MB	0
llama-b7833-bin-win-cuda-12.4-x64.zip	2026-01-25	219.5 MB	0
llama-b7833-bin-win-cpu-x64.zip	2026-01-25	30.2 MB	0
llama-b7833-bin-win-cpu-arm64.zip	2026-01-25	23.8 MB	7
llama-b7833-bin-ubuntu-x64.tar.gz	2026-01-25	23.8 MB	0
llama-b7833-bin-ubuntu-vulkan-x64.tar.gz	2026-01-25	40.6 MB	0
llama-b7833-bin-ubuntu-s390x.tar.gz	2026-01-25	24.4 MB	0
llama-b7833-bin-macos-x64.tar.gz	2026-01-25	83.2 MB	0
llama-b7833-bin-macos-arm64.tar.gz	2026-01-25	29.5 MB	0
llama-b7833-bin-910b-openEuler-x86-aclgraph.tar.gz	2026-01-25	59.5 MB	0
llama-b7833-bin-910b-openEuler-aarch64-aclgraph.tar.gz	2026-01-25	53.8 MB	0
llama-b7833-bin-310p-openEuler-x86.tar.gz	2026-01-25	59.5 MB	0
llama-b7833-bin-310p-openEuler-aarch64.tar.gz	2026-01-25	53.8 MB	0
cudart-llama-bin-win-cuda-13.1-x64.zip	2026-01-25	402.6 MB	0
cudart-llama-bin-win-cuda-12.4-x64.zip	2026-01-25	391.4 MB	0
b7833 source code.tar.gz	2026-01-25	28.8 MB	0
b7833 source code.zip	2026-01-25	29.8 MB	0
README.md	2026-01-25	2.9 kB	0
Totals: 23 Items		2.4 GB	7

ggml-cpu: Use tiled FA for prompt-processing (#19012)

* ggml-cpu: Use tiled FA for prompt-processing

  Flash attention (FA) performance suffers on CPU at long contexts because it essentially uses a vector kernel. This PR adds a tiled FA kernel for prompt processing (PP). Perf tuning for tile sizes was done on an AMD EPYC single-socket 64-core machine.
* fix out of bounds for mask
* skip rows where there are all masks
* skip tile if mask is inf
* store mask in worksize
* check inf tile earlier
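To make the idea behind the change concrete, here is a minimal pure-Python sketch of tiled attention with an online softmax and the "skip tile if mask is inf" shortcut. It is not the PR's code: the real kernel lives in ggml's C/C++ CPU backend, and the function names, the tile sizes, and the list-of-lists layout below are all assumptions made for illustration. The sketch also assumes every query row has at least one unmasked key, as with a causal mask.

```python
import math

NEG_INF = float("-inf")


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def naive_attention(q, k, v, mask):
    """Reference: softmax(q k^T * scale + mask) v, one query row at a time."""
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for i, qi in enumerate(q):
        scores = [scale * dot(qi, kj) + mask[i][j] for j, kj in enumerate(k)]
        m = max(scores)
        w = [math.exp(s - m) for s in scores]
        z = sum(w)
        out.append([sum(w[j] * v[j][d] for j in range(len(k))) / z
                    for d in range(len(v[0]))])
    return out


def tiled_attention(q, k, v, mask, tile_q=2, tile_kv=2):
    """Hypothetical tiled variant; returns (output, number of skipped tiles)."""
    n_q, n_kv, d_v = len(q), len(k), len(v[0])
    scale = 1.0 / math.sqrt(len(q[0]))
    out = [[0.0] * d_v for _ in range(n_q)]
    skipped = 0
    for q0 in range(0, n_q, tile_q):
        rows = range(q0, min(q0 + tile_q, n_q))
        # Per-row running max, running softmax sum, unnormalized accumulator.
        run_max = {i: NEG_INF for i in rows}
        run_sum = {i: 0.0 for i in rows}
        acc = {i: [0.0] * d_v for i in rows}
        for k0 in range(0, n_kv, tile_kv):
            cols = range(k0, min(k0 + tile_kv, n_kv))
            # Check the mask first: a fully masked tile contributes nothing.
            if all(mask[i][j] == NEG_INF for i in rows for j in cols):
                skipped += 1
                continue
            for i in rows:
                scores = [scale * dot(q[i], k[j]) + mask[i][j] for j in cols]
                tile_max = max(scores)
                if tile_max == NEG_INF:
                    continue  # this row is fully masked inside the tile
                new_max = max(run_max[i], tile_max)
                # Rescale earlier partial results to the new running max.
                corr = math.exp(run_max[i] - new_max)
                run_sum[i] *= corr
                acc[i] = [a * corr for a in acc[i]]
                for s, j in zip(scores, cols):
                    w = math.exp(s - new_max)  # masked entries give 0.0
                    run_sum[i] += w
                    for d in range(d_v):
                        acc[i][d] += w * v[j][d]
                run_max[i] = new_max
        for i in rows:
            out[i] = [a / run_sum[i] for a in acc[i]]
    return out, skipped
```

With a 6×6 causal mask and 2×2 tiles, three of the nine tiles lie entirely above the diagonal and are skipped, while the result matches the naive reference up to floating-point rounding. Processing a block of query rows per key/value tile is what lets a real kernel reuse loaded keys and values across rows, which a one-row-at-a-time vector kernel cannot do.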

macOS/iOS:
- macOS Apple Silicon (arm64)
- macOS Intel (x64)
- iOS XCFramework

Linux:
- Ubuntu x64 (CPU)
- Ubuntu x64 (Vulkan)
- Ubuntu s390x (CPU)

Windows:
- Windows x64 (CPU)
- Windows arm64 (CPU)
- Windows x64 (CUDA 12) - CUDA 12.4 DLLs
- Windows x64 (CUDA 13) - CUDA 13.1 DLLs
- Windows x64 (Vulkan)
- Windows x64 (SYCL)
- Windows x64 (HIP)

openEuler:
- openEuler x86 (310p)
- openEuler x86 (910b, ACL Graph)
- openEuler aarch64 (310p)
- openEuler aarch64 (910b, ACL Graph)
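The platform list above maps onto the archive names in the file table. As a rough sketch of that mapping, the lookup below pairs a few platforms with their archives. The file names are copied from the listing; the `(os, arch, backend)` keys and the `pick_asset` helper are hypothetical and not part of llama.cpp or SourceForge.

```python
# Archive names copied from the b7833 file listing; the keys are illustrative.
ASSETS = {
    ("macos", "arm64", "cpu"): "llama-b7833-bin-macos-arm64.tar.gz",
    ("macos", "x64", "cpu"): "llama-b7833-bin-macos-x64.tar.gz",
    ("ubuntu", "x64", "cpu"): "llama-b7833-bin-ubuntu-x64.tar.gz",
    ("ubuntu", "x64", "vulkan"): "llama-b7833-bin-ubuntu-vulkan-x64.tar.gz",
    ("ubuntu", "s390x", "cpu"): "llama-b7833-bin-ubuntu-s390x.tar.gz",
    ("windows", "x64", "cpu"): "llama-b7833-bin-win-cpu-x64.zip",
    ("windows", "arm64", "cpu"): "llama-b7833-bin-win-cpu-arm64.zip",
    ("windows", "x64", "vulkan"): "llama-b7833-bin-win-vulkan-x64.zip",
    ("windows", "x64", "cuda-12.4"): "llama-b7833-bin-win-cuda-12.4-x64.zip",
    ("windows", "x64", "cuda-13.1"): "llama-b7833-bin-win-cuda-13.1-x64.zip",
}


def pick_asset(os_name, arch, backend="cpu"):
    """Return the archive name for a platform, or None if it is not listed here."""
    return ASSETS.get((os_name.lower(), arch.lower(), backend.lower()))


print(pick_asset("Windows", "x64", "vulkan"))
```

The CUDA builds for Windows are listed alongside separate `cudart-llama-bin-win-cuda-*-x64.zip` archives containing the CUDA DLLs.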

Source: README.md, updated 2026-01-25

llama.cpp Files

Port of Facebook's LLaMA model in C/C++
