Distributed Llama v0.15.4

This version brings another speedup in Vulkan inference.

Prediction (--steps 128)

RTX 3090 24GB, AMD EPYC 7313 16-Core Processor (https://github.com/b4rtaz/distributed-llama/pull/252)

| Model | Tokens/s (version 0.15.1) | Tokens/s (version 0.15.2) | Tokens/s (version 0.15.3) | Tokens/s (this version) |
| --- | --- | --- | --- | --- |
| llama3_1_8b_instruct_q40 | 24.80 | 24.80 | 33.32 | 45.33 🚀 |
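
To put the numbers above in relative terms, here is a minimal sketch that computes the throughput ratio between each earlier version and this one (0.15.4). The figures are copied from the benchmark row; the names `tokens_per_second` and `speedup` are illustrative and not part of Distributed Llama.

```python
# Tokens/s for llama3_1_8b_instruct_q40 on the RTX 3090 setup,
# copied from the benchmark row above.
tokens_per_second = {
    "0.15.1": 24.80,
    "0.15.2": 24.80,
    "0.15.3": 33.32,
    "0.15.4": 45.33,  # this version
}


def speedup(old: float, new: float) -> float:
    """Return how many times faster `new` is than `old` (hypothetical helper)."""
    return new / old


current_version = "0.15.4"
current_rate = tokens_per_second[current_version]

for version, rate in tokens_per_second.items():
    if version == current_version:
        continue
    ratio = speedup(rate, current_rate)
    print(f"{version} -> {current_version}: {ratio:.2f}x")
```

On this single benchmark, that works out to roughly 1.36x over 0.15.3 and roughly 1.83x over 0.15.1 and 0.15.2.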
Source: README.md, updated 2025-08-20