feat: compress skill files using caveman mode to reduce per-session token cost

Apra Fleet is an open-source MCP server

Brought to you by: apralabs

#204 feat: compress skill files using caveman mode to reduce per-session token cost

Status: open

Owner: nobody

Labels: None

Updated: 2026-04-28

Created: 2026-04-28

Creator: Anonymous

Private: No

Originally created by: kumaakh

Problem

Fleet skill files (skills/pm/*.md, skills/fleet/*.md; 28 files in total) are loaded as input tokens on every PM/fleet invocation. As a session grows longer, each file also re-enters context on every turn. The files are currently plain prose that has never been optimised for token count.

Goal

Apply caveman-style compression to all 28 skill files, targeting 40–60% token reduction without breaking behaviour.

Approach

Install caveman tooling — add JuliusBrussee/caveman as a Claude Code skill
Compress all skill files — run caveman pass on every file in skills/pm/ and skills/fleet/, including tpl-*.md (all files are consumed exclusively by LLMs — no human-readability constraint)
Risk review — use caveman's review/risk-identification tools to flag any compressed passages where meaning was lost or instructions became ambiguous
Regression test — run a representative set of PM commands against the compressed skills and verify behaviour is unchanged
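
As a complement to the risk-review step, a cheap automated pre-check can catch the most obvious losses before any human or LLM review. The sketch below is not part of caveman's tooling; it is a hypothetical heuristic that treats backticked spans, `.md` file names, and ALL-CAPS keywords as "critical" tokens and reports any that disappeared from the compressed text. The pattern and the function names are assumptions made for illustration.

```python
import re

# Hypothetical heuristic (not caveman's review tool): backticked spans,
# .md file names/paths, and ALL-CAPS keywords count as "critical" tokens.
CRITICAL = re.compile(r"`[^`]+`|[\w./-]+\.md\b|\b[A-Z][A-Z_]{2,}\b")


def critical_tokens(text: str) -> set[str]:
    """Collect critical tokens, with surrounding backticks stripped."""
    return {m.group(0).strip("`") for m in CRITICAL.finditer(text)}


def lost_tokens(original: str, compressed: str) -> set[str]:
    """Return critical tokens from the original that no longer appear
    anywhere in the compressed text (plain substring check)."""
    return {t for t in critical_tokens(original) if t not in compressed}


if __name__ == "__main__":
    before = "Always read `PLAN.md` before running cleanup.md. NEVER push to main."
    after = "Read PLAN.md before cleanup. NEVER push main."
    print(sorted(lost_tokens(before, after)))
```

An empty result does not prove the meaning survived; it only means no file reference or emphasised keyword was dropped outright, so the behavioural regression test is still needed.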

Files in scope

All 28 files under skills/pm/ and skills/fleet/ — no exceptions:

Operational: SKILL.md ×2, single-pair-sprint.md, multi-pair-sprint.md, simple-sprint.md, doer-reviewer.md, cleanup.md, init.md, context-file.md, plan-prompt.md, onboarding.md, permissions.md, troubleshooting.md, skill-matrix.md, auth-github.md, auth-bitbucket.md, auth-azdevops.md
Templates: tpl-doer.md, tpl-reviewer.md, tpl-reviewer-plan.md, tpl-plan.md, tpl-deploy.md, tpl-design.md, tpl-requirements.md, tpl-status.md, tpl-backlog.md, tpl-projects.md, tpl-pm.md
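
To confirm that a compression pass really touched every file in scope, the inventory above can be checked mechanically. This is a minimal sketch, assuming both directories are flat (no nested folders) and that templates are exactly the files whose names start with `tpl-`; `skill_files` and `split_by_kind` are illustrative names, not existing Fleet helpers.

```python
from pathlib import Path

SKILL_DIRS = ("skills/pm", "skills/fleet")  # directories named in this issue
EXPECTED_COUNT = 28                         # total stated in this issue


def skill_files(repo_root: str) -> list[Path]:
    """List every .md file directly under the two skill directories.
    Assumes a flat layout; nested folders would need rglob instead."""
    root = Path(repo_root)
    files: list[Path] = []
    for d in SKILL_DIRS:
        files.extend(sorted((root / d).glob("*.md")))
    return files


def split_by_kind(files: list[Path]) -> tuple[list[Path], list[Path]]:
    """Split into (operational, templates) using the tpl- prefix."""
    templates = [f for f in files if f.name.startswith("tpl-")]
    operational = [f for f in files if not f.name.startswith("tpl-")]
    return operational, templates


if __name__ == "__main__":
    found = skill_files(".")
    operational, templates = split_by_kind(found)
    print(f"{len(found)} files: {len(operational)} operational, {len(templates)} templates")
    if len(found) != EXPECTED_COUNT:
        print(f"warning: expected {EXPECTED_COUNT} files")
```

Run from the repository root, the lists above imply 17 operational files and 11 templates; any other split suggests a file was added, removed, or missed.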

Expected outcome

40–60% token reduction on skill file input per session
No change in PM/fleet behaviour
Risk review sign-off before merge
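
The 40–60% target can be checked per file with a small measurement script. The sketch below assumes a rough estimate of about four characters per token, which is only an approximation; real numbers would need the tokenizer of the model actually in use. The function names and the "below"/"within"/"above" labels are illustrative.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: about 4 characters per token (an assumption,
    not a real tokenizer). Never returns less than 1."""
    return max(1, len(text) // 4)


def reduction_pct(original: str, compressed: str) -> float:
    """Percentage of estimated tokens removed by compression."""
    before = estimate_tokens(original)
    after = estimate_tokens(compressed)
    return 100.0 * (before - after) / before


def classify(pct: float, low: float = 40.0, high: float = 60.0) -> str:
    """Place a reduction percentage relative to the target band."""
    if pct < low:
        return "below"
    if pct > high:
        return "above"
    return "within"


if __name__ == "__main__":
    original = "a" * 400    # stand-in for an uncompressed skill file
    compressed = "a" * 200  # stand-in for its compressed version
    pct = reduction_pct(original, compressed)
    print(f"{pct:.1f}% reduction, {classify(pct)} target band")
```

A result above the band is not necessarily a failure, but it may be worth a closer look during risk review, since very aggressive compression is where instructions are most likely to have been lost.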

Notes

tpl-*.md files are only ever read by LLMs — no human-readability constraint, compress equally
Compressed responses are a secondary benefit: the saving counted here is future input tokens, since each response re-enters context in later turns of a multi-turn session, not current output spend
Priority: high — these files load on every invocation across all active projects
