
Wllama version 3.0 is out - with multimodal and tool calling support 🚀🚀

V3.0 is a major architectural overhaul that replaces the custom wllama core with server-context, the inference component from llama-server. Key highlights:

  • 🔥 Full OpenAI-compatible (OAI) API: createChatCompletion, createCompletion, createEmbedding
  • 🖼️ Multimodal support (vision/audio inputs)
  • 🔨 Native tool calling support
  • 🥷 Jinja-based chat template parsing (same as llama-server)
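
As a rough illustration of what "OAI-compatible" means for the highlights above, here is a minimal sketch of a request that combines a multimodal user message with a tool definition. The object shapes follow the public OpenAI chat-completions format. Whether wllama's createChatCompletion accepts exactly this object is an assumption, so check the release notes for the real signature. The `get_weather` tool and the helpers `buildChatRequest` and `parseToolCall` are hypothetical names used only for this example.

```typescript
// Shapes below mirror the public OpenAI chat-completions format.
// ASSUMPTION: wllama v3 accepts a compatible object; this sketch never calls wllama.
type ContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

interface ChatMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string | ContentPart[];
}

interface ToolDefinition {
  type: "function";
  function: {
    name: string;
    description: string;
    parameters: Record<string, unknown>; // JSON Schema for the arguments
  };
}

interface ChatRequest {
  messages: ChatMessage[];
  tools?: ToolDefinition[];
  max_tokens?: number;
}

// Tool calls as they appear on an assistant message in the OpenAI format.
interface ToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string }; // arguments is a JSON string
}

// Hypothetical tool, defined only for illustration.
const getWeatherTool: ToolDefinition = {
  type: "function",
  function: {
    name: "get_weather",
    description: "Look up the current weather for a city",
    parameters: {
      type: "object",
      properties: { city: { type: "string" } },
      required: ["city"],
    },
  },
};

// Build a request with a text prompt and, optionally, one image input.
function buildChatRequest(prompt: string, imageUrl?: string): ChatRequest {
  const parts: ContentPart[] = [{ type: "text", text: prompt }];
  if (imageUrl !== undefined) {
    parts.push({ type: "image_url", image_url: { url: imageUrl } });
  }
  return {
    messages: [{ role: "user", content: parts }],
    tools: [getWeatherTool],
    max_tokens: 256,
  };
}

// Decode a tool call returned by the model into a name plus parsed arguments.
function parseToolCall(call: ToolCall): { name: string; args: Record<string, unknown> } {
  const args = JSON.parse(call.function.arguments) as Record<string, unknown>;
  return { name: call.function.name, args };
}
```

The request object would be passed to the chat-completion call, and any tool call in the reply would go through `parseToolCall` before the application runs the matching function and sends the result back as a `tool` message.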

View the full release notes here: https://github.com/ngxson/wllama/blob/master/guides/intro-v3.md

What's Changed

Full Changelog: https://github.com/ngxson/wllama/compare/2.4.0...3.0.0

Source: README.md, updated 2026-05-08