ZeroGPU Router is a task router for AI agents that sends simple, focused workloads to smaller specialized models. It is designed to reduce inference costs without replacing the main reasoning model used by the agent. The router exposes tools for tasks such as summarization, classification, PII redaction, entity extraction, JSON extraction, follow-up generation, and short chat. It works through MCP for OpenClaw and through a CLI plus plugin workflow for Claude Code. Each routed call can report useful metadata such as the selected model, latency, and estimated savings. Its main value is letting agents reserve expensive frontier models for complex reasoning while offloading routine language tasks to cheaper specialized routes.
Features
- Cost-saving AI task routing
- MCP integration for OpenClaw
- Claude Code plugin support
- Summarization and classification routes
- PII redaction and entity extraction
- Per-call latency and savings metadata