whichllm v0.5.10

Fixed

  • Strong partial-offload candidates are no longer buried under weaker full-GPU models: the final sort now counts GPU fit only once.
  • Light partial offload is penalized less aggressively, while heavy dense offload still gets a strong discount.
  • MoE partial-offload scoring now gives a milder penalty when the active working set can plausibly stay on GPU.
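
The three fixes above can be illustrated with a small ranking sketch. This is not whichllm's actual code: the `Candidate` fields, the penalty constants, and the function names are all hypothetical, chosen only to show the shape of the behaviour described (GPU fit applied once, a gentle penalty for light offload, a steep one for heavy dense offload, and a milder one for MoE models whose active weights fit in VRAM).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical model candidate; field names are illustrative only."""
    name: str
    quality: float                # quality score, higher is better
    gpu_fraction: float           # share of weights that fit in VRAM (0.0 to 1.0)
    is_moe: bool = False
    active_fraction: float = 1.0  # share of weights active per token


def offload_factor(c: Candidate) -> float:
    """Multiplier in (0, 1] that discounts quality for offloaded weights."""
    offloaded = max(0.0, 1.0 - c.gpu_fraction)
    if offloaded == 0.0:
        return 1.0
    if c.is_moe and c.active_fraction <= c.gpu_fraction:
        # Assumed rule: the active working set fits on the GPU, so the
        # penalty is mild and linear.
        return 1.0 - 0.25 * offloaded
    # Assumed rule for dense models: the quadratic term keeps light offload
    # cheap and makes heavy offload expensive. Constants are made up.
    return max(0.1, 1.0 - 0.2 * offloaded - 1.2 * offloaded ** 2)


def score(c: Candidate) -> float:
    return c.quality * offload_factor(c)


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """GPU fit enters only through score(), so it is counted once."""
    return sorted(candidates, key=score, reverse=True)


def rank_double_counting(candidates: list[Candidate]) -> list[Candidate]:
    """One way GPU fit could be counted twice: a full-fit flag is also
    the primary sort key, so any full-GPU model outranks every
    partial-offload model regardless of score."""
    return sorted(
        candidates,
        key=lambda c: (c.gpu_fraction >= 1.0, score(c)),
        reverse=True,
    )


candidates = [
    Candidate("small-full-gpu", quality=60.0, gpu_fraction=1.0),
    Candidate("large-light-offload", quality=80.0, gpu_fraction=0.9),
    Candidate("large-heavy-offload", quality=85.0, gpu_fraction=0.3),
    Candidate("moe-partial", quality=82.0, gpu_fraction=0.5,
              is_moe=True, active_fraction=0.1),
]

if __name__ == "__main__":
    for c in rank(candidates):
        print(f"{c.name}: {score(c):.2f}")
```

With these made-up numbers, `rank` places the lightly offloaded large model and the MoE model ahead of the small full-GPU model, while `rank_double_counting` puts the small full-GPU model first even though its score is lower.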
Source: README.md, updated 2026-06-11