Wllama version 3.0 is out - with multimodal and tool calling support 🚀🚀
V3.0 is a major architectural overhaul that replaces the custom wllama core with server-context, the inference component from llama-server. Key highlights:
- 🔥 Full OAI-compatible API: `createChatCompletion`, `createCompletion`, `createEmbedding`
- 🖼️ Multimodal support (vision/audio inputs)
- 🔨 Native tool calling support
- 🥷 Jinja-based chat template parsing (same as llama-server)
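As a rough illustration of the OAI-compatible direction, the new `createChatCompletion` API accepts OpenAI-style message arrays. The snippet below sketches that message shape in TypeScript; the surrounding `Wllama` calls and option names are assumptions shown only as comments, so check the release guide linked below for the actual signatures.

```typescript
// OpenAI-style chat message shape, as accepted by OAI-compatible
// chat completion APIs (field names follow the OpenAI convention).
type ChatMessage = {
  role: 'system' | 'user' | 'assistant';
  content: string;
};

const messages: ChatMessage[] = [
  { role: 'system', content: 'You are a helpful assistant.' },
  { role: 'user', content: 'Hello!' },
];

// Assumed usage (method names taken from the release notes above;
// constructor arguments, model URL, and options are illustrative):
//
//   const wllama = new Wllama(CONFIG_PATHS);
//   await wllama.loadModelFromUrl('https://example.com/model.gguf');
//   const reply = await wllama.createChatCompletion(messages, { nPredict: 128 });

console.log(messages.length);
```

Because the message format matches OpenAI's, existing prompt-building code written against the OpenAI Chat Completions schema should carry over with little change.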
View the full release notes here: https://github.com/ngxson/wllama/blob/master/guides/intro-v3.md
What's Changed
- Reuse llama-server source code (v3.0.0 - huge breaking changes ahead!) by @ngxson in https://github.com/ngxson/wllama/pull/213
Full Changelog: https://github.com/ngxson/wllama/compare/2.4.0...3.0.0