Agentex
Open-source codebase for Scale Agentex
...It also includes evaluation harnesses that capture success criteria and partial credit, plus traces you can inspect to understand where reasoning or tool use failed. The design encourages clean separation between experiment configuration and code, which makes sharing results or re-running baselines straightforward. Teams use it to progress from prototypes to production-ready agent behaviors by iterating on prompts, adding tools, and validating improvements with consistent metrics.
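To make the ideas of partial credit and configuration kept apart from code concrete, here is a minimal Python sketch. It is not the Agentex API: every name in it (`Criterion`, `ExperimentConfig`, `load_config`, `score`, the criteria, and the JSON fields) is hypothetical and chosen only for illustration. It assumes an agent run yields a plain dictionary, that each success criterion carries a weight, and that experiment settings arrive as JSON rather than being hard-coded.

```python
import json
from dataclasses import dataclass
from typing import Callable

# Hypothetical names throughout; this is an illustration, not the Agentex API.


@dataclass(frozen=True)
class Criterion:
    name: str
    weight: float
    check: Callable[[dict], bool]  # receives the agent's output dictionary


@dataclass(frozen=True)
class ExperimentConfig:
    prompt_version: str
    pass_threshold: float


def load_config(raw: str) -> ExperimentConfig:
    """Parse experiment settings from JSON so they live outside the code."""
    return ExperimentConfig(**json.loads(raw))


def score(output: dict, criteria: list[Criterion]) -> tuple[float, list[str]]:
    """Return weighted partial credit in [0, 1] and the names of failed criteria."""
    total = sum(c.weight for c in criteria)
    if total <= 0:
        raise ValueError("criteria weights must sum to a positive value")
    failed = [c.name for c in criteria if not c.check(output)]
    earned = sum(c.weight for c in criteria if c.name not in failed)
    return earned / total, failed


# Assumed success criteria for a single task.
CRITERIA = [
    Criterion("called_search_tool", 1.0, lambda o: "search" in o.get("tools_used", [])),
    Criterion("answer_present", 1.0, lambda o: bool(o.get("answer"))),
    Criterion("cited_source", 2.0, lambda o: bool(o.get("citations"))),
]

# In practice the JSON would come from a file checked in next to the results.
config = load_config('{"prompt_version": "v2", "pass_threshold": 0.75}')

# A made-up agent output that used the tool and answered, but cited nothing.
output = {"tools_used": ["search"], "answer": "42", "citations": []}

credit, failed = score(output, CRITERIA)
passed = credit >= config.pass_threshold
print(f"credit={credit:.2f} failed={failed} passed={passed}")
# prints: credit=0.50 failed=['cited_source'] passed=False
```

The list of failed criteria acts as a very small stand-in for a trace: it shows which check cost the run its credit. Because the threshold and prompt version come from the JSON, re-running a baseline under this sketch means reusing the same configuration with unchanged scoring code.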