| Name | Modified | Size |
|------|----------|------|
| llama-b7436-xcframework.zip | < 10 hours ago | 149.5 MB |
| llama-b7436-xcframework.tar.gz | < 10 hours ago | 149.6 MB |
| llama-b7436-bin-win-vulkan-x64.zip | < 10 hours ago | 34.9 MB |
| llama-b7436-bin-win-sycl-x64.zip | < 10 hours ago | 109.1 MB |
| llama-b7436-bin-win-opencl-adreno-arm64.zip | < 10 hours ago | 16.8 MB |
| llama-b7436-bin-win-hip-radeon-x64.zip | < 10 hours ago | 347.8 MB |
| llama-b7436-bin-win-cuda-13.1-x64.zip | < 10 hours ago | 92.6 MB |
| llama-b7436-bin-win-cuda-12.4-x64.zip | < 10 hours ago | 203.9 MB |
| llama-b7436-bin-win-cpu-x64.zip | < 10 hours ago | 20.0 MB |
| llama-b7436-bin-win-cpu-arm64.zip | < 10 hours ago | 16.2 MB |
| llama-b7436-bin-ubuntu-x64.zip | < 10 hours ago | 19.0 MB |
| llama-b7436-bin-ubuntu-x64.tar.gz | < 10 hours ago | 19.1 MB |
| llama-b7436-bin-ubuntu-vulkan-x64.zip | < 10 hours ago | 34.5 MB |
| llama-b7436-bin-ubuntu-vulkan-x64.tar.gz | < 10 hours ago | 34.5 MB |
| llama-b7436-bin-ubuntu-s390x.zip | < 10 hours ago | 19.0 MB |
| llama-b7436-bin-ubuntu-s390x.tar.gz | < 10 hours ago | 22.4 MB |
| llama-b7436-bin-macos-x64.zip | < 10 hours ago | 42.7 MB |
| llama-b7436-bin-macos-x64.tar.gz | < 10 hours ago | 42.8 MB |
| llama-b7436-bin-macos-arm64.zip | < 10 hours ago | 16.6 MB |
| llama-b7436-bin-macos-arm64.tar.gz | < 10 hours ago | 16.6 MB |
| llama-b7436-bin-910b-openEuler-x86.tar.gz | < 10 hours ago | 47.9 MB |
| llama-b7436-bin-910b-openEuler-aarch64.tar.gz | < 10 hours ago | 43.7 MB |
| llama-b7436-bin-310p-openEuler-x86.tar.gz | < 10 hours ago | 47.9 MB |
| llama-b7436-bin-310p-openEuler-aarch64.tar.gz | < 10 hours ago | 43.7 MB |
| cudart-llama-bin-win-cuda-13.1-x64.zip | < 10 hours ago | 402.6 MB |
| cudart-llama-bin-win-cuda-12.4-x64.zip | < 10 hours ago | 391.4 MB |
| b7436 source code.tar.gz | < 17 hours ago | 28.2 MB |
| b7436 source code.zip | < 17 hours ago | 29.1 MB |
| README.md | < 17 hours ago | 3.8 kB |

Totals: 29 items, 2.4 GB

> [!WARNING]
> **Release Format Update:** Linux releases will soon use `.tar.gz` archives instead of `.zip`. Please make the necessary changes to your deployment scripts.

server: fix crash when batch > ubatch with embeddings (#17912)

Fixes #12836, where the server crashes with a GGML_ASSERT failure when running with embeddings enabled and n_batch > n_ubatch.

Root cause: embeddings use non-causal attention, which requires all tokens to be processed within a single ubatch. When n_batch > n_ubatch, the server attempts to split processing, causing the assertion failure.

Solution:
- Add parameter validation in main() after common_params_parse()
- When embeddings are enabled and n_batch > n_ubatch:
  - Log warnings explaining the issue
  - Automatically set n_batch = n_ubatch
  - Prevent the server crash

This follows the approach suggested by @ggerganov in issue #12836. It supersedes the stalled PR #12940, which attempted a runtime fix in the old examples/server/server.cpp location; this implementation validates at startup in tools/server/server.cpp (the current location).

Testing:
- Build: compiles successfully
- Validation triggers: warns when -b > -ub with --embedding
- Auto-correction works: adjusts n_batch = n_ubatch
- No false positives: valid params do not trigger warnings
- Verified on macOS M3 Pro with an embedding model

Co-authored-by: ytian218 <ytian218@bloomberg.net>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
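For reference, the guard described above boils down to a one-time startup check that clamps the batch size. The sketch below is a minimal, self-contained illustration of that logic, not the code from tools/server/server.cpp: the `params_sketch` struct, `validate_embedding_batch` helper, and plain `stderr` warning are placeholders standing in for llama.cpp's `common_params` and logging macros.

```cpp
#include <cstdio>

// Illustrative stand-in for the relevant fields of the server's parameters
// (hypothetical struct; not the actual common_params definition).
struct params_sketch {
    bool embedding = false;  // --embedding
    int  n_batch   = 2048;   // -b / --batch-size
    int  n_ubatch  = 512;    // -ub / --ubatch-size
};

// Startup validation sketch: with embeddings, non-causal attention requires
// the whole batch to be processed in a single ubatch, so clamp n_batch down
// to n_ubatch instead of letting a GGML_ASSERT fire later during decoding.
static void validate_embedding_batch(params_sketch & params) {
    if (params.embedding && params.n_batch > params.n_ubatch) {
        std::fprintf(stderr,
            "warning: embeddings require n_batch <= n_ubatch; "
            "setting n_batch = n_ubatch (%d)\n", params.n_ubatch);
        params.n_batch = params.n_ubatch;
    }
}

int main() {
    params_sketch params;
    params.embedding = true;  // as if launched with --embedding -b 2048 -ub 512
    validate_embedding_batch(params);
    std::printf("n_batch = %d, n_ubatch = %d\n", params.n_batch, params.n_ubatch);
    return 0;
}
```

With the example flags above, the check fires once and processing continues with n_batch = 512; when -b is already no larger than -ub, or embeddings are disabled, nothing is logged or changed.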

macOS/iOS:
- macOS Apple Silicon (arm64)
- macOS Intel (x64)
- iOS XCFramework

Linux:
- Ubuntu x64 (CPU)
- Ubuntu x64 (Vulkan)
- Ubuntu s390x (CPU)

Windows:
- Windows x64 (CPU)
- Windows arm64 (CPU)
- Windows x64 (CUDA 12)
- Windows x64 (CUDA 13)
- Windows x64 (Vulkan)
- Windows x64 (SYCL)
- Windows x64 (HIP)

openEuler:
- openEuler x86 (310p)
- openEuler x86 (910b)
- openEuler aarch64 (310p)
- openEuler aarch64 (910b)

Source: README.md, updated 2025-12-16