
b7895

| Name | Modified | Size |
|------|----------|------|
| llama-b7895-xcframework.zip | 2026-01-30 | 174.6 MB |
| llama-b7895-bin-win-vulkan-x64.zip | 2026-01-30 | 47.2 MB |
| llama-b7895-bin-win-sycl-x64.zip | 2026-01-30 | 120.2 MB |
| llama-b7895-bin-win-opencl-adreno-arm64.zip | 2026-01-30 | 25.0 MB |
| llama-b7895-bin-win-hip-radeon-x64.zip | 2026-01-30 | 365.2 MB |
| llama-b7895-bin-win-cuda-13.1-x64.zip | 2026-01-30 | 146.7 MB |
| llama-b7895-bin-win-cuda-12.4-x64.zip | 2026-01-30 | 216.6 MB |
| llama-b7895-bin-win-cpu-x64.zip | 2026-01-30 | 30.6 MB |
| llama-b7895-bin-win-cpu-arm64.zip | 2026-01-30 | 24.2 MB |
| llama-b7895-bin-ubuntu-x64.tar.gz | 2026-01-30 | 24.4 MB |
| llama-b7895-bin-ubuntu-vulkan-x64.tar.gz | 2026-01-30 | 41.2 MB |
| llama-b7895-bin-ubuntu-s390x.tar.gz | 2026-01-30 | 25.2 MB |
| llama-b7895-bin-macos-x64.tar.gz | 2026-01-30 | 84.8 MB |
| llama-b7895-bin-macos-arm64.tar.gz | 2026-01-30 | 30.0 MB |
| llama-b7895-bin-910b-openEuler-x86-aclgraph.tar.gz | 2026-01-30 | 61.4 MB |
| llama-b7895-bin-910b-openEuler-aarch64-aclgraph.tar.gz | 2026-01-30 | 55.4 MB |
| llama-b7895-bin-310p-openEuler-x86.tar.gz | 2026-01-30 | 61.4 MB |
| llama-b7895-bin-310p-openEuler-aarch64.tar.gz | 2026-01-30 | 55.4 MB |
| cudart-llama-bin-win-cuda-13.1-x64.zip | 2026-01-30 | 402.6 MB |
| cudart-llama-bin-win-cuda-12.4-x64.zip | 2026-01-30 | 391.4 MB |
| b7895 source code.tar.gz | 2026-01-30 | 28.9 MB |
| b7895 source code.zip | 2026-01-30 | 29.9 MB |
| README.md | 2026-01-30 | 4.3 kB |

Totals: 23 items, 2.4 GB
lookup, lookahead: fix crash when n_ctx not specified (#18729)

* lookup, lookahead: fix crash when n_ctx not specified

Since PR #16653 (Dec 15, 2025), the default n_ctx is 0 to enable automatic GPU memory fitting. This causes llama-lookup and llama-lookahead to crash when run without an explicit -c flag:

`GGML_ASSERT(batch.seq_id[batch.n_tokens] && "llama_batch size exceeded")`

Root cause: both examples use params.n_ctx directly for batch initialization, but params.n_ctx remains 0 even after the context is properly initialized to n_ctx_train internally.

Bug history:
- Nov 2023: lookahead.cpp created (PR #4207) with the params.n_ctx pattern
- Dec 2023: lookup.cpp created (PR #4484) with the same pattern
- Nov 2024: default n_ctx changed to 4096 (PR #10136); bug dormant
- Dec 2025: default n_ctx changed to 0 (PR #16653); bug activated

The bug was dormant for 2+ years because params.n_ctx defaulted to 512, then 4096. PR #16653 changed it to 0 for GPU auto-fitting, triggering the crash.

Fix: use llama_n_ctx(ctx) to get the actual runtime context size, matching the pattern already used elsewhere in lookup.cpp (line 72) and in speculative.cpp/speculative-simple.cpp.

Tested: llama-lookup now works without the -c flag (12.5% acceptance on Gemma-3-1B).

Note: llama-lookahead has a separate pre-existing issue with sequence initialization (n_seq_max=1 vs the W+G+1 needed) that is unrelated to this fix.

* lookahead: fix n_seq_max and kv_unified configuration

Lookahead decoding requires:
- W + G + 1 = 31 sequences for parallel Jacobi decoding
- a unified KV cache for coupled sequences in batch splitting

These requirements were broken after PR #14482 changed the validation logic. Consolidates the fix from PR #18730 per maintainer request.

Commit message drafted with Claude.

macOS/iOS:
- macOS Apple Silicon (arm64)
- macOS Intel (x64)
- iOS XCFramework

Linux:
- Ubuntu x64 (CPU)
- Ubuntu x64 (Vulkan)
- Ubuntu s390x (CPU)

Windows:
- Windows x64 (CPU)
- Windows arm64 (CPU)
- Windows x64 (CUDA 12) - CUDA 12.4 DLLs
- Windows x64 (CUDA 13) - CUDA 13.1 DLLs
- Windows x64 (Vulkan)
- Windows x64 (SYCL)
- Windows x64 (HIP)

openEuler:
- openEuler x86 (310p)
- openEuler x86 (910b, ACL Graph)
- openEuler aarch64 (310p)
- openEuler aarch64 (910b, ACL Graph)

Source: README.md, updated 2026-01-30