llama.cpp - Browse /b8190 at SourceForge.net


Name	Modified	Size	Downloads / Week
llama-b8190-xcframework.zip	< 24 hours ago	169.4 MB	0
llama-b8190-bin-win-vulkan-x64.zip	< 24 hours ago	48.3 MB	0
llama-b8190-bin-win-sycl-x64.zip	< 24 hours ago	121.0 MB	0
llama-b8190-bin-win-opencl-adreno-arm64.zip	< 24 hours ago	25.6 MB	0
llama-b8190-bin-win-hip-radeon-x64.zip	< 24 hours ago	345.0 MB	0
llama-b8190-bin-win-cuda-13.1-x64.zip	< 24 hours ago	148.9 MB	0
llama-b8190-bin-win-cuda-12.4-x64.zip	< 24 hours ago	220.4 MB	0
llama-b8190-bin-win-cpu-x64.zip	< 24 hours ago	31.4 MB	0
llama-b8190-bin-win-cpu-arm64.zip	< 24 hours ago	24.7 MB	0
llama-b8190-bin-ubuntu-x64.tar.gz	< 24 hours ago	25.1 MB	0
llama-b8190-bin-ubuntu-vulkan-x64.tar.gz	< 24 hours ago	42.3 MB	0
llama-b8190-bin-ubuntu-s390x.tar.gz	< 24 hours ago	26.2 MB	0
llama-b8190-bin-ubuntu-rocm-7.2-x64.tar.gz	< 24 hours ago	145.2 MB	0
llama-b8190-bin-macos-x64.tar.gz	< 24 hours ago	88.5 MB	0
llama-b8190-bin-macos-arm64.tar.gz	< 24 hours ago	30.7 MB	0
llama-b8190-bin-910b-openEuler-x86-aclgraph.tar.gz	< 24 hours ago	63.1 MB	0
llama-b8190-bin-910b-openEuler-aarch64-aclgraph.tar.gz	< 24 hours ago	57.0 MB	0
llama-b8190-bin-310p-openEuler-x86.tar.gz	< 24 hours ago	63.1 MB	0
llama-b8190-bin-310p-openEuler-aarch64.tar.gz	< 24 hours ago	57.0 MB	0
cudart-llama-bin-win-cuda-13.1-x64.zip	< 24 hours ago	402.6 MB	0
cudart-llama-bin-win-cuda-12.4-x64.zip	< 24 hours ago	391.4 MB	0
b8190 source code.tar.gz	2026-03-03	29.1 MB	0
b8190 source code.zip	2026-03-03	30.1 MB	0
README.md	2026-03-03	4.0 kB	0
Totals: 24 Items		2.6 GB	0

ggml webgpu: fix workgroup dispatch limit for large batch sizes (#19965)

* ggml-webgpu: fix workgroup dispatch limit for large batch sizes

  WebGPU limits the number of workgroups per dispatch dimension to 65535. Large MUL_MAT operations with batch sizes exceeding this limit would fail.

  * add compute_2d_workgroups() helper to split the total workgroup count across the X/Y dimensions
  * update mul_mat_reg_tile.wgsl to reconstruct the linear workgroup ID from the 2D dispatch
  * update mul_mat_subgroup_matrix.wgsl to reconstruct the linear workgroup ID from the 2D dispatch
  * update mul_mat.wgsl to compute the global index from 2D workgroup coordinates
  * refactor all three mul_mat dispatch paths to use the shared helper

* ggml-webgpu: add bounds checking for over-dispatched workgroups

  A 2D workgroup dispatch can over-dispatch when the total number of workgroups does not divide evenly into the 65535 per-dimension limit. The extra workgroups would compute invalid batch indices, causing memory corruption.

  * add a batch_idx bounds check to mul_mat_reg_tile.wgsl and mul_mat_subgroup_matrix.wgsl to prevent over-dispatched workgroups from accessing invalid memory
  * fixes test failures with large batch sizes (e.g., bs=[128, 1024])

* ggml-webgpu: add back TODO for splitting large sizes into batches

* Optimize 2d workgroup provisioning

* Set some parameters that increase speed

---------

Co-authored-by: Reese Levine <reeselevine1@gmail.com>
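The 2D dispatch split and the bounds check described in this change can be sketched in a few lines. This is a minimal illustration in Python, not the code from the commit: only the name compute_2d_workgroups() and the 65535 limit come from the notes above, while the function bodies, the helper names linear_workgroup_id() and active_workgroups(), and the choice of Y and X are assumptions made for the example. The real logic lives in the ggml-webgpu backend and its WGSL shaders and may differ.

```python
# Illustrative sketch only; not the ggml-webgpu implementation.
MAX_WORKGROUPS_PER_DIM = 65535  # per-dimension dispatch limit cited in the notes


def compute_2d_workgroups(total_wg, max_per_dim=MAX_WORKGROUPS_PER_DIM):
    """Split a 1D workgroup count into an (x, y) dispatch within the limit.

    Hypothetical stand-in for the helper named in the commit.
    """
    if total_wg < 1:
        raise ValueError("total_wg must be positive")
    if total_wg > max_per_dim * max_per_dim:
        raise ValueError("too many workgroups for a 2D dispatch")
    wg_y = -(-total_wg // max_per_dim)  # ceiling division: rows needed
    wg_x = -(-total_wg // wg_y)         # ceiling division: columns per row
    return wg_x, wg_y


def linear_workgroup_id(wg_id_x, wg_id_y, num_wg_x):
    """Rebuild the linear workgroup ID from 2D dispatch coordinates."""
    return wg_id_y * num_wg_x + wg_id_x


def active_workgroups(total_wg, max_per_dim=MAX_WORKGROUPS_PER_DIM):
    """Simulate the dispatch and return the linear IDs that do real work."""
    wg_x, wg_y = compute_2d_workgroups(total_wg, max_per_dim)
    active = []
    for y in range(wg_y):
        for x in range(wg_x):
            linear = linear_workgroup_id(x, y, wg_x)
            if linear >= total_wg:
                continue  # over-dispatched workgroup: exit before touching memory
            active.append(linear)
    return active


wg_x, wg_y = compute_2d_workgroups(131072)
print(wg_x, wg_y, wg_x * wg_y - 131072)  # 43691 3 1
```

With 131072 workgroups the grid is 43691 x 3, which dispatches one workgroup more than needed. That surplus workgroup is the case the bounds check guards against.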

macOS/iOS:

Linux:

Windows: