b8634
| Name | Modified | Size |
| --- | --- | --- |
| llama-b8634-xcframework.zip | < 9 hours ago | 174.9 MB |
| llama-b8634-bin-win-vulkan-x64.zip | < 9 hours ago | 56.1 MB |
| llama-b8634-bin-win-sycl-x64.zip | < 9 hours ago | 134.9 MB |
| llama-b8634-bin-win-opencl-adreno-arm64.zip | < 9 hours ago | 32.9 MB |
| llama-b8634-bin-win-hip-radeon-x64.zip | < 9 hours ago | 359.9 MB |
| llama-b8634-bin-win-cuda-13.1-x64.zip | < 9 hours ago | 167.5 MB |
| llama-b8634-bin-win-cuda-12.4-x64.zip | < 9 hours ago | 249.0 MB |
| llama-b8634-bin-win-cpu-x64.zip | < 9 hours ago | 38.9 MB |
| llama-b8634-bin-win-cpu-arm64.zip | < 9 hours ago | 31.8 MB |
| llama-b8634-bin-ubuntu-x64.tar.gz | < 9 hours ago | 31.4 MB |
| llama-b8634-bin-ubuntu-vulkan-x64.tar.gz | < 9 hours ago | 48.6 MB |
| llama-b8634-bin-ubuntu-vulkan-arm64.tar.gz | < 9 hours ago | 40.8 MB |
| llama-b8634-bin-ubuntu-s390x.tar.gz | < 9 hours ago | 35.0 MB |
| llama-b8634-bin-ubuntu-rocm-7.2-x64.tar.gz | < 9 hours ago | 159.0 MB |
| llama-b8634-bin-ubuntu-openvino-2026.0-x64.tar.gz | < 9 hours ago | 75.6 MB |
| llama-b8634-bin-ubuntu-arm64.tar.gz | < 9 hours ago | 27.6 MB |
| llama-b8634-bin-macos-x64.tar.gz | < 9 hours ago | 101.7 MB |
| llama-b8634-bin-macos-arm64.tar.gz | < 9 hours ago | 39.9 MB |
| llama-b8634-bin-910b-openEuler-x86-aclgraph.tar.gz | < 9 hours ago | 71.3 MB |
| llama-b8634-bin-910b-openEuler-aarch64-aclgraph.tar.gz | < 9 hours ago | 63.8 MB |
| llama-b8634-bin-310p-openEuler-x86.tar.gz | < 9 hours ago | 71.3 MB |
| llama-b8634-bin-310p-openEuler-aarch64.tar.gz | < 9 hours ago | 63.8 MB |
| cudart-llama-bin-win-cuda-13.1-x64.zip | < 9 hours ago | 402.6 MB |
| cudart-llama-bin-win-cuda-12.4-x64.zip | < 9 hours ago | 391.4 MB |
| b8634 source code.tar.gz | < 13 hours ago | 29.6 MB |
| b8634 source code.zip | < 13 hours ago | 30.8 MB |
| README.md | < 13 hours ago | 4.3 kB |

Totals: 27 items, 2.9 GB
chat : add Granite 4.0 chat template with correct tool_call role mapping (#20804)

* chat : add Granite 4.0 chat template with correct tool_call role mapping

  Introduce `LLM_CHAT_TEMPLATE_GRANITE_4_0` alongside the existing Granite 3.x template (renamed `LLM_CHAT_TEMPLATE_GRANITE_3_X`). The Granite 4.0 Jinja template uses `<tool_call>` XML tags and maps the `assistant_tool_call` role to `<|start_of_role|>assistant<|end_of_role|><|tool_call|>`. Without a matching C++ handler, the fallback path emits the literal role `assistant_tool_call`, which the model does not recognize, breaking tool calling when `--jinja` is not used.

  Changes:
  - Rename `LLM_CHAT_TEMPLATE_GRANITE` to `LLM_CHAT_TEMPLATE_GRANITE_3_X` (preserves existing 3.x behavior unchanged)
  - Add `LLM_CHAT_TEMPLATE_GRANITE_4_0` enum, map entry, and handler
  - Detection: `<|start_of_role|>` + (`<tool_call>` or `<tools>`) → 4.0, otherwise → 3.x
  - Add production Granite 4.0 Jinja template
  - Add tests for both 3.x and 4.0 template paths (C++ and Jinja)

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Code review: follow standard format and use common logic in test-chat-template.cpp

* Rename custom_conversation variable to extra_conversation to give it a more meaningful name

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

macOS/iOS: macOS tarballs (arm64, x64) and the xcframework bundle

Linux: Ubuntu tarballs (x64, arm64, s390x; CPU, Vulkan, ROCm, and OpenVINO variants)

Windows: zips for CPU (x64, arm64), CUDA 12.4/13.1, HIP/Radeon, SYCL, Vulkan, and OpenCL/Adreno, plus the cudart runtime packages

openEuler: 910b and 310p builds (x86 and aarch64)

Source: README.md, updated 2026-04-02