| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| AgentHandover-0.4.0.pkg | 2026-06-08 | 10.5 MB | |
| README.md | 2026-06-08 | 2.6 kB | |
| v0.4.0 -- QAT Gemma 4 default models source code.tar.gz | 2026-06-08 | 5.4 MB | |
| v0.4.0 -- QAT Gemma 4 default models source code.zip | 2026-06-08 | 5.6 MB | |
| Totals: 4 Items | | 21.5 MB | 0 |
New default local models — Google's QAT (Quantization-Aware Training) Gemma 4 checkpoints, chosen by a head-to-head A/B test on a real focus recording, not by spec sheets.
The test
I built a harness that replays a real recorded session (dailynews) through the exact pipeline, swapping only the model and holding the screen annotations fixed — so the comparison isolates how well each model reasons about what you did. Two runs per model.
16 GB tier (Gemma 4 E4B vs Gemma 4 12B QAT): E4B couldn't even identify what the workflow was — it returned "Unclear … the user does not complete a final artifact" on both runs. The 12B QAT model nailed it both runs — "Drafts a daily news digest email, subject 'Thursday Daily News', aggregating updates from X, Reddit, and Hacker News" — extracted a typed variable, and produced goal-directed steps. A 12B-dense model now fits where a 4B-effective one used to, and reasons dramatically better about the why.
8 GB tier (Qwen 3.5:4b vs Gemma 4 e2b QAT): the opposite. Qwen correctly identified the task on both runs; the 2B e2b got it wrong on both runs (and showed a brace bug). At this size the Gemma checkpoint loses the plot, so the 8 GB tier stays on Qwen rather than regressing the most constrained users.
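The shape of the comparison described above can be sketched roughly as follows. This is not the actual harness: `ab_replay` and the `run_pipeline` callable are hypothetical names, and the real pipeline entry point is assumed to take a model name plus the fixed annotations and return the model's workflow description.

```python
from typing import Callable, Dict, List


def ab_replay(
    models: List[str],
    annotations: List[str],
    run_pipeline: Callable[[str, List[str]], str],  # hypothetical pipeline entry point
    runs_per_model: int = 2,
) -> Dict[str, List[str]]:
    """Replay the same fixed annotations through each model several times."""
    results: Dict[str, List[str]] = {}
    for model in models:
        # Each run gets its own copy so a pipeline cannot mutate the shared input;
        # the model name is the only thing that varies between runs.
        results[model] = [
            run_pipeline(model, list(annotations)) for _ in range(runs_per_model)
        ]
    return results
```

Keeping the annotations fixed and swapping only the model name is what lets a difference in output be attributed to the model rather than to the screen capture.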
New tier table
| RAM | Was | Now | Download |
|---|---|---|---|
| 8 GB | Qwen 3.5 | Qwen 3.5 (unchanged) | ~6 GB |
| 16 GB | Gemma 4 E4B | Gemma 4 12B QAT | ~7 GB (was ~10) |
| 24 GB | Gemma 4 E4B Q8 | Gemma 4 12B QAT | ~7 GB (was ~12) |
| 48 GB+ | Gemma 4 31B | Gemma 4 31B QAT | ~18 GB (was ~20) |
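As an illustration of how the table maps RAM to a default, here is a minimal sketch. The function name is hypothetical, the model names are the display names from the table rather than Ollama tags, and picking the largest tier whose RAM threshold is met (so a 32 GB machine falls into the 24 GB tier) is an assumption, not something the table states.

```python
from typing import Optional

# (minimum RAM in GB, model name as written in the tier table), largest first.
TIERS = [
    (48, "Gemma 4 31B QAT"),
    (24, "Gemma 4 12B QAT"),
    (16, "Gemma 4 12B QAT"),
    (8, "Qwen 3.5"),
]


def default_model_for_ram(ram_gb: float) -> Optional[str]:
    """Return the default model for the largest tier the machine satisfies."""
    for min_gb, name in TIERS:
        if ram_gb >= min_gb:
            return name
    # Below the smallest listed tier the table gives no default.
    return None
```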
Requires Ollama 0.30.6+
The QAT tags need it (older Ollama returns HTTP 412 on pull). Install the official Ollama app (ollama.com/download or brew install --cask ollama-app). Heads-up: the Homebrew ollama formula does not bundle the GGUF runner and cannot run these models — use the official app. The 8 GB Qwen tier still works on older Ollama.
Existing installs keep their current model until re-onboarded; new installs and agenthandover setup --vlm get the new defaults.
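A check against the 0.30.6 minimum could look like the sketch below. The helper names are hypothetical, and the input is assumed to be any text containing a dotted version number, for example whatever ollama --version prints. Versions are compared as integer tuples because plain string comparison would rank 0.9.x above 0.30.x.

```python
import re
from typing import Tuple

MIN_OLLAMA = "0.30.6"  # minimum version stated in these notes


def parse_version(text: str) -> Tuple[int, ...]:
    """Extract the first MAJOR.MINOR.PATCH triple found in the text."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    return tuple(int(part) for part in match.groups())


def supports_qat_tags(version_text: str, minimum: str = MIN_OLLAMA) -> bool:
    """True if the reported version is at least the required minimum."""
    return parse_version(version_text) >= parse_version(minimum)
```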
Install
- Direct: AgentHandover-0.4.0.pkg (signed + notarized + stapled)
- Homebrew: brew install --cask sandroandric/agenthandover/agenthandover
- Sparkle auto-update: existing users are prompted on the next check.
SHA-256: 4c7c270afd5bac741726b9bc10b30a598f348225a7ea78e7803d3ab7753535df
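To compare a download against the digest above, something like the following would work. It assumes the published SHA-256 refers to AgentHandover-0.4.0.pkg, which these notes do not state explicitly; the function names are illustrative.

```python
import hashlib
import hmac
from pathlib import Path

# Digest published in these release notes (assumed to be for the .pkg).
EXPECTED_SHA256 = "4c7c270afd5bac741726b9bc10b30a598f348225a7ea78e7803d3ab7753535df"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large installers are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: Path, expected: str = EXPECTED_SHA256) -> bool:
    """Compare the file's digest with the expected one, ignoring hex letter case."""
    return hmac.compare_digest(sha256_of(path), expected.lower())
```

Calling `matches_published_hash(Path("AgentHandover-0.4.0.pkg"))` on the downloaded file should return True if the file is intact and the assumption about which file the digest covers holds.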
3026/3026 Python tests pass.