

New default local models — Google's QAT (Quantization-Aware Training) Gemma 4 checkpoints, chosen by a head-to-head A/B test on a real focus recording, not by spec sheets.

The test

I built a harness that replays a real recorded session (dailynews) through the exact pipeline, swapping only the model and holding the screen annotations fixed — so the comparison isolates how well each model reasons about what you did. Two runs per model.
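
The shape of that comparison can be sketched in a few lines of Python. This is only an illustration of the idea, not the actual harness: `run_ab` and the `Pipeline` signature are hypothetical names, and the real pipeline call is stood in for by whatever callable you pass.

```python
from typing import Callable, Dict, List

# Hypothetical signature: (model name, fixed screen annotations) -> task summary.
Pipeline = Callable[[str, List[str]], str]


def run_ab(
    pipeline: Pipeline,
    models: List[str],
    annotations: List[str],
    runs: int = 2,
) -> Dict[str, List[str]]:
    """Replay the same annotations through every model, `runs` times each."""
    frozen = tuple(annotations)  # snapshot so every run sees identical input
    results: Dict[str, List[str]] = {}
    for model in models:
        results[model] = []
        for _ in range(runs):
            # Only the model name changes between calls; each call gets a
            # fresh copy so a pipeline cannot mutate the shared input.
            results[model].append(pipeline(model, list(frozen)))
    return results
```

Because the annotations are frozen before the loop, any difference between the per-model outputs comes from the model and not from the input.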

16 GB tier (Gemma 4 E4B vs Gemma 4 12B QAT): E4B could not identify the workflow at all. It returned "Unclear … the user does not complete a final artifact" on both runs. The 12B QAT model identified it correctly on both runs ("Drafts a daily news digest email, subject 'Thursday Daily News', aggregating updates from X, Reddit, and Hacker News"), extracted a typed variable, and produced goal-directed steps. A 12B dense model now fits where a 4B-effective one used to, and it is far better at inferring the purpose behind the recorded actions.

8 GB tier (Qwen 3.5:4b vs Gemma 4 E2B QAT): the opposite result. Qwen correctly identified the task on both runs; the 2B E2B model got it wrong on both runs (and showed a brace bug). At this size the Gemma model loses track of the task, so the 8 GB tier stays on Qwen rather than regressing the most constrained users.

New tier table

RAM      Previous default    New default             Download
8 GB     Qwen 3.5            Qwen 3.5 (unchanged)    ~6 GB
16 GB    Gemma 4 E4B         Gemma 4 12B QAT         ~7 GB (was ~10 GB)
24 GB    Gemma 4 E4B Q8      Gemma 4 12B QAT         ~7 GB (was ~12 GB)
48 GB+   Gemma 4 31B         Gemma 4 31B QAT         ~18 GB (was ~20 GB)
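
As a rough sketch, the table reads as a lookup from installed RAM to a default model. The function name `default_model` is hypothetical, the strings are the display names from the table and not Ollama tags, and two behaviours are assumptions the table does not state: machines between tiers (for example 32 GB) fall back to the highest tier at or below their RAM, and machines under 8 GB are rejected.

```python
from typing import List, Tuple

# (minimum RAM in GB, new default model), copied from the tier table above.
TIERS: List[Tuple[int, str]] = [
    (8, "Qwen 3.5"),
    (16, "Gemma 4 12B QAT"),
    (24, "Gemma 4 12B QAT"),
    (48, "Gemma 4 31B QAT"),
]


def default_model(ram_gb: float) -> str:
    """Return the default model for the highest tier that fits in ram_gb."""
    if ram_gb < TIERS[0][0]:
        # Assumption: the table lists no tier below 8 GB.
        raise ValueError(f"no tier defined below {TIERS[0][0]} GB")
    chosen = TIERS[0][1]
    for min_ram, model in TIERS:  # tiers are sorted smallest first
        if ram_gb >= min_ram:
            chosen = model
    return chosen
```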

Requires Ollama 0.30.6+

The QAT tags need it (older Ollama returns HTTP 412 on pull). Install the official Ollama app (ollama.com/download or brew install --cask ollama-app). Heads-up: the Homebrew ollama formula does not bundle the GGUF runner and cannot run these models — use the official app. The 8 GB Qwen tier still works on older Ollama.
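
If you want to check the minimum version from a script, a comparison along these lines should work. It is a sketch under assumptions: `supports_qat_tags` and `parse_version` are hypothetical helpers, and how you obtain the version string (for example from the output of `ollama --version`) is left to the caller. The point is to compare numerically, because a plain string comparison would rank 0.9.9 above 0.30.6.

```python
import re
from typing import Tuple

MIN_OLLAMA = "0.30.6"  # minimum stated in these release notes


def parse_version(text: str) -> Tuple[int, int, int]:
    """Extract the first x.y.z triple found anywhere in text."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no x.y.z version found in {text!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def supports_qat_tags(version_text: str, minimum: str = MIN_OLLAMA) -> bool:
    """True if the reported version is at least the stated minimum."""
    # Tuples of ints compare component by component, so 0.30.6 > 0.9.9.
    return parse_version(version_text) >= parse_version(minimum)
```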

Existing installs keep their current model until re-onboarded; new installs and agenthandover setup --vlm get the new defaults.

Install

  • Direct: AgentHandover-0.4.0.pkg (signed + notarized + stapled)
  • Homebrew: brew install --cask sandroandric/agenthandover/agenthandover
  • Sparkle auto-update: existing users prompted on next check.

SHA-256: 4c7c270afd5bac741726b9bc10b30a598f348225a7ea78e7803d3ab7753535df
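
To check a download against that hash, a short script like the following is one option. It assumes the published SHA-256 refers to AgentHandover-0.4.0.pkg, which the notes imply but do not state outright; the helper names are illustrative.

```python
import hashlib
from pathlib import Path

# Hash published in these release notes (assumed to be for the .pkg).
EXPECTED_SHA256 = "4c7c270afd5bac741726b9bc10b30a598f348225a7ea78e7803d3ab7753535df"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in 1 MiB chunks so large downloads need little memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """Compare the file's hex digest with the expected one, ignoring case."""
    return sha256_of(Path(path)) == expected.strip().lower()
```

Calling `matches_published_hash("AgentHandover-0.4.0.pkg")` from the download directory returns True only if the file is byte-for-byte what was hashed.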

3026/3026 Python tests pass.

Source: README.md, updated 2026-06-08