Harness-1
Ultra Recipe for Training Long-Horizon Search Agents
Harness-1 is a 20B search agent trained with reinforcement learning inside a stateful retrieval harness. It is designed for long-horizon search tasks where the model must search, inspect documents, curate evidence, verify claims, and decide when enough evidence has been gathered. The harness externalizes search state, including candidate documents, evidence links, verification records, and budget-aware context. This lets the policy focus on higher-level decisions instead of trying to keep...
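To make the idea of externalized search state concrete, here is a minimal sketch of a state object that a harness could hold on behalf of the policy. It is an illustration under stated assumptions, not the Harness-1 implementation: the class name `SearchState`, its fields and methods, the one-unit cost per search or verification call, and the rule that a claim counts as verified when at least one evidence link exists are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SearchState:
    """Hypothetical search state kept by the harness rather than in the model's context."""

    budget: int  # remaining tool calls (assumed unit of budget)
    candidates: dict[str, str] = field(default_factory=dict)      # doc_id -> snippet
    evidence: dict[str, list[str]] = field(default_factory=dict)  # claim -> supporting doc_ids
    verified: dict[str, bool] = field(default_factory=dict)       # claim -> verification result

    def _spend(self) -> None:
        # Assumption: every search or verification call costs one budget unit.
        if self.budget <= 0:
            raise RuntimeError("search budget exhausted")
        self.budget -= 1

    def add_candidate(self, doc_id: str, snippet: str) -> None:
        """Record a retrieved document as a candidate."""
        self._spend()
        self.candidates[doc_id] = snippet

    def link_evidence(self, claim: str, doc_id: str) -> None:
        """Attach a known candidate document to a claim (free of charge here)."""
        if doc_id not in self.candidates:
            raise KeyError(f"unknown document: {doc_id}")
        self.evidence.setdefault(claim, []).append(doc_id)

    def verify(self, claim: str) -> bool:
        """Toy verification rule: a claim passes if it has at least one evidence link."""
        self._spend()
        ok = bool(self.evidence.get(claim))
        self.verified[claim] = ok
        return ok

    def context(self) -> str:
        """Compact summary the harness could show the policy instead of raw history."""
        passed = sum(self.verified.values())
        return (
            f"budget={self.budget} "
            f"candidates={len(self.candidates)} "
            f"verified={passed}/{len(self.verified)}"
        )


state = SearchState(budget=3)
state.add_candidate("d1", "example snippet")
state.link_evidence("claim A", "d1")
state.verify("claim A")
state.verify("claim B")
print(state.context())
```

In this sketch the policy only ever sees the short string returned by `context()`, while the full record of documents, links, and verification results lives in the harness. A real harness would presumably use a learned or retrieval-based verifier rather than the toy rule above.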