Agent Reinforcement Trainer (ART) is an open-source reinforcement learning framework for training large language model agents through experience, making them more reliable and performant on multi-turn, multi-step tasks. Rather than relying only on hand-crafted prompts or supervised fine-tuning, ART uses techniques like Group Relative Policy Optimization (GRPO) to let agents learn from environmental feedback and reward signals. The framework is designed to integrate easily with Python applications and abstracts away much of the RL infrastructure, so developers can train agents without deep RL expertise or heavy infrastructure overhead. ART also supports scalable training patterns, observability tools, and integration with hosted platforms like Weights & Biases, and it provides notebooks that demonstrate training on standard benchmarks and tasks.
Features
- Reinforcement learning training for multi-step agents
- Group Relative Policy Optimization (GRPO) support
- Python-friendly API and integration model
- Infrastructure abstraction for easier deployment
- Monitoring support through tools like Weights & Biases (W&B)
- Example notebooks demonstrating tasks and training
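
To illustrate the idea behind GRPO mentioned above, here is a minimal sketch of the group-relative advantage step as it is commonly described: several attempts at the same task are scored, and each reward is normalized against the mean and standard deviation of its group. This is not ART's API or internal code. The function name `group_relative_advantages`, the `eps` parameter, the use of population standard deviation, and the sample rewards are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of attempts at the same task.

    Hypothetical helper for illustration; not part of ART.
    """
    if not rewards:
        return []
    mu = mean(rewards)
    # Population standard deviation is an assumption; some implementations
    # use the sample standard deviation instead.
    sigma = pstdev(rewards)
    # eps avoids division by zero when every attempt gets the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Made-up rewards for four attempts at the same task.
rewards = [1.0, 0.0, 0.5, 0.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Attempts that score above the group average receive a positive advantage and those below receive a negative one, so the policy update favors behavior that did better than its peers without needing a separate value model.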