
v3.18.0
Name    Modified    Size
node-llama-cpp-electron-example.Linux.3.18.0.x64.tar.gz 2026-03-15 301.6 MB
node-llama-cpp-electron-example.Linux.3.18.0.arm64.tar.gz 2026-03-15 149.4 MB
node-llama-cpp-electron-example.Linux.3.18.0.arm64.deb 2026-03-15 122.0 MB
node-llama-cpp-electron-example.Linux.3.18.0.amd64.deb 2026-03-15 256.0 MB
node-llama-cpp-electron-example.Linux.3.18.0.amd64.snap 2026-03-15 268.5 MB
node-llama-cpp-electron-example.Linux.3.18.0.x86_64.AppImage 2026-03-15 300.4 MB
node-llama-cpp-electron-example.Linux.3.18.0.arm64.AppImage 2026-03-15 157.3 MB
node-llama-cpp-electron-example.macOS.3.18.0.x64.zip 2026-03-15 160.0 MB
node-llama-cpp-electron-example.macOS.3.18.0.arm64.zip 2026-03-15 147.7 MB
node-llama-cpp-electron-example.macOS.3.18.0.x64.dmg 2026-03-15 165.7 MB
node-llama-cpp-electron-example.macOS.3.18.0.arm64.dmg 2026-03-15 153.1 MB
node-llama-cpp-electron-example.Windows.3.18.0.x64.exe 2026-03-15 370.4 MB
node-llama-cpp-electron-example.Windows.3.18.0.arm64.exe 2026-03-15 134.5 MB
node-llama-cpp-electron-example.Windows.3.18.0.exe 2026-03-15 504.2 MB
README.md 2026-03-15 2.8 kB
v3.18.0 source code.tar.gz 2026-03-15 21.9 MB
v3.18.0 source code.zip 2026-03-15 22.3 MB
Totals: 17 items, 3.2 GB

3.18.0 (2026-03-15)

Features

  • automatic checkpoints for models that need them (#573) (c641959)
  • QwenChatWrapper: Qwen 3.5 support (see the usage sketch after this list) (#573) (c641959)
  • inspect gpu command: detect and report missing prebuilt binary modules and the use of a custom npm registry (#573) (c641959)
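
The Qwen 3.5 support lands in the existing QwenChatWrapper. Below is a minimal usage sketch, not an official snippet from this release: node-llama-cpp normally picks the right chat wrapper automatically from the model's metadata, and the model path here is only a placeholder.

    import {getLlama, LlamaChatSession, QwenChatWrapper} from "node-llama-cpp";

    // Load a local GGUF file; the path is a placeholder, not a file shipped with this release.
    const llama = await getLlama();
    const model = await llama.loadModel({modelPath: "path/to/qwen3.5-instruct.Q4_K_M.gguf"});
    const context = await model.createContext();

    // The chat wrapper is normally resolved automatically from the model metadata;
    // passing QwenChatWrapper explicitly only makes the choice visible here.
    const session = new LlamaChatSession({
        contextSequence: context.getSequence(),
        chatWrapper: new QwenChatWrapper()
    });

    console.log(await session.prompt("Summarize this release in one sentence."));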

Bug Fixes

  • resolveModelFile: deduplicate concurrent downloads of the same model (see the sketch after this list) (#570) (cc105b9)
  • correct Vulkan URL casing in documentation links (#568) (5a44506)
  • Qwen 3.5 memory estimation (#573) (c641959)
  • grammar use with HarmonyChatWrapper (#573) (c641959)
  • add Mistral think segment detection (#573) (c641959)
  • compress excessively long segments from the current response on context shift instead of throwing an error (#573) (c641959)
  • default thinking budget to 75% of the context size to prevent low-quality responses (#573) (c641959)
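
For the resolveModelFile fix, here is a rough sketch of the scenario it addresses: two parts of an app resolving the same model URI at the same time should now share a single download rather than starting two. The Hugging Face URI and the models directory below are placeholders, not files tied to this release.

    import path from "path";
    import {fileURLToPath} from "url";
    import {resolveModelFile} from "node-llama-cpp";

    const modelsDir = path.join(path.dirname(fileURLToPath(import.meta.url)), "models");
    const modelUri = "hf:example-user/example-model-GGUF/model.Q4_K_M.gguf"; // placeholder URI

    // Two concurrent resolutions of the same URI; with the dedup fix they should
    // share one download instead of racing each other, and both return the same path.
    const [pathA, pathB] = await Promise.all([
        resolveModelFile(modelUri, modelsDir),
        resolveModelFile(modelUri, modelsDir)
    ]);

    console.log(pathA === pathB); // expected: true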

Shipped with llama.cpp release b8352

To use the latest llama.cpp release available, run npx -n node-llama-cpp source download --release latest. (learn more)

Source: README.md, updated 2026-03-15