| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| README.md | 2025-10-16 | 7.1 kB | |
| rLLM_ v0.2.0 source code.tar.gz | 2025-10-16 | 1.3 MB | |
| rLLM_ v0.2.0 source code.zip | 2025-10-16 | 1.5 MB | |
| Totals: 3 Items | | 2.9 MB | 3 |
rLLM v0.2: RL Training over General Agentic Programs (Blog Post)
We are excited to release rLLM v0.2, a major upgrade of our RL training framework. In v0.1, rLLM provided agent and OpenAI Gym-like environment abstractions to support training ReAct-style agents. In v0.2, we additionally introduce AgentWorkflowEngine and AgentWorkflowTrainer, more general abstractions that enable arbitrary agentic programs to be trained. Agent builders and researchers can now define multi-agent systems, complex workflows (e.g., solver-judge, planner-executor, MCTS), and agentic programs with custom reward functions, and train them with reinforcement learning without rewriting their production code.
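To make the "solver-judge" pattern concrete, here is a minimal sketch of such a workflow written as an ordinary Python program with a custom reward function. It deliberately does not use the rLLM API: `Step`, `reward_fn`, `solver_judge`, and the toy policies are hypothetical names invented for illustration, and the model calls are replaced by plain callables so the example runs on its own. How a program like this is actually registered with AgentWorkflowEngine or AgentWorkflowTrainer is not shown here.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical stand-in for a model call: prompt text in, response text out.
Policy = Callable[[str], str]


@dataclass
class Step:
    """One model call in the workflow, with the reward assigned to it."""
    role: str        # "solver" or "judge"
    prompt: str
    response: str
    reward: float = 0.0


def reward_fn(answer: str, ground_truth: str) -> float:
    """Custom reward (an assumption): exact match after trimming whitespace."""
    return 1.0 if answer.strip() == ground_truth.strip() else 0.0


def solver_judge(question: str, ground_truth: str,
                 solver: Policy, judge: Policy,
                 n_candidates: int = 2) -> List[Step]:
    """Sample several solver answers, then let a judge pick one by index."""
    steps: List[Step] = []
    candidates: List[str] = []
    for i in range(n_candidates):
        prompt = f"Attempt {i + 1}: {question}"
        answer = solver(prompt)
        candidates.append(answer)
        # Each solver step is rewarded on its own answer.
        steps.append(Step("solver", prompt, answer, reward_fn(answer, ground_truth)))

    listing = "\n".join(f"{i}: {c}" for i, c in enumerate(candidates))
    judge_prompt = (f"Question: {question}\nCandidates:\n{listing}\n"
                    "Reply with the index of the best candidate.")
    choice = judge(judge_prompt)
    try:
        idx = int(choice.strip())
        picked = candidates[idx] if 0 <= idx < len(candidates) else ""
    except ValueError:
        picked = ""  # An unparseable judge reply selects nothing.
    # The judge step is rewarded on the answer it selected.
    steps.append(Step("judge", judge_prompt, choice, reward_fn(picked, ground_truth)))
    return steps


def toy_solver(prompt: str) -> str:
    # Deterministic toy policy: the first attempt is right, later ones are wrong.
    return "4" if prompt.startswith("Attempt 1") else "5"


def toy_judge(prompt: str) -> str:
    # Deterministic toy policy: always picks candidate 0.
    return "0"


if __name__ == "__main__":
    for step in solver_judge("What is 2 + 2?", "4", toy_solver, toy_judge):
        print(step.role, step.reward)
```

In this sketch every model call becomes a step with its own reward, which is the kind of per-step signal an RL trainer could consume. The reward assignment shown (solver steps scored on their own answer, the judge scored on its selection) is one possible design choice, not necessarily the one rLLM uses.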
Key Features in v0.2
- Support the official `verl==0.5.0` as the training backend; a custom verl fork is no longer needed! `verl==0.5.0` brings the following features, which are now supported in rLLM (@kylemontgomery1):
  - Megatron training support (@jeewoo-lee)
  - SGLang as the rollout engine, in addition to vLLM.
- Introduce `AgentWorkflowEngine`, which enables passing in arbitrary agentic programs for training. (@kylemontgomery1)
- Support more agents and environments
  - Terminus and TerminalBench (@JasonWei05)
  - Tongyi DeepResearch agent (@yayashuxue)
  - AppWorld and AppWorldReactAgent (@sunan135)
- Integration with other agentic frameworks/SDKs
  - Strands SDK from AWS
  - SmolAgents
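Because this release targets an exact backend version, it can help to confirm what is installed before launching a training run. The following is a minimal sketch using only the Python standard library; `check_pin` is a hypothetical helper, not part of rLLM or verl, and it assumes the backend is installed under the distribution name `verl`.

```python
from importlib import metadata


def check_pin(package: str, expected: str) -> str:
    """Return "ok", "mismatch", or "missing" for an exact version pin."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return "missing"
    return "ok" if installed == expected else "mismatch"


if __name__ == "__main__":
    # Assumption: the training backend is installed under the name "verl".
    print("verl==0.5.0:", check_pin("verl", "0.5.0"))
```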
What's Changed
- fix <tool_calls_begin> variable by @wj-Mcat in https://github.com/rllm-org/rllm/pull/142
- Fix not registered license from code by @annyan09023 in https://github.com/rllm-org/rllm/pull/144
- fix r2egym import error; update installation README by @jeffreysijuntan in https://github.com/rllm-org/rllm/pull/146
- update deepscaler max_prompt_length to avoid exception during training by @jeffreysijuntan in https://github.com/rllm-org/rllm/pull/148
- fix(syntax): Resolve invalid escape sequence warnings by @tonyz0x0 in https://github.com/rllm-org/rllm/pull/154
- added Tools for SFT by @mananroongta in https://github.com/rllm-org/rllm/pull/160
- update docs by @jeffreysijuntan in https://github.com/rllm-org/rllm/pull/167
- Add dark mode to docs by @philippnormann in https://github.com/rllm-org/rllm/pull/168
- [FIX] Fix tool calling result parsing problem in trajectory visualizer & MCP tool name fixing by @VincentXWD in https://github.com/rllm-org/rllm/pull/174
- [hotfix][miniwob] Fix gymnasium.error.NameNotFound by @abrohamLee in https://github.com/rllm-org/rllm/pull/172
- Load full DeepCoder dataset, instead of LCB subset by @mananroongta in https://github.com/rllm-org/rllm/pull/178
- [feat][docker] Installation with Docker by @abrohamLee in https://github.com/rllm-org/rllm/pull/177
- Add macOS compatibility: exclude GPU dependencies on darwin by @yayashuxue in https://github.com/rllm-org/rllm/pull/180
- Torch 2.7.0 only compatible with MacOS python=3.11 by @yayashuxue in https://github.com/rllm-org/rllm/pull/184
- Migrate to verl v0.5.0 by @kylemontgomery1 in https://github.com/rllm-org/rllm/pull/193
- Terminal Bench Integration into rLLM (Simplified) by @JasonWei05 in https://github.com/rllm-org/rllm/pull/205
- feat: Integrate Strands SDK with RLLM for scalable tool-enabled agent training by @yayashuxue in https://github.com/rllm-org/rllm/pull/206
- Add VimGolf agent training example by @James4Ever0 in https://github.com/rllm-org/rllm/pull/209
- fix: update search engine source data path by @noiji in https://github.com/rllm-org/rllm/pull/216
- [feature] Adding Megatron support for v0.2 by @jeewoo-lee in https://github.com/rllm-org/rllm/pull/221
- Use RolloutEngine for single_turn_workflow.py by @1stprinciple in https://github.com/rllm-org/rllm/pull/223
- Standalone inference: remove hard verl dependency by @JasonWei05 in https://github.com/rllm-org/rllm/pull/228
- Update pyproject.toml to v0.2.0 by @NIL-zhuang in https://github.com/rllm-org/rllm/pull/229
- proper handling the case that next_observation is empty dict by @erranlli in https://github.com/rllm-org/rllm/pull/233
- [v0.2] Add lazy import to fix circular import and ray init config support by @listar2000 in https://github.com/rllm-org/rllm/pull/236
- v0.2 verl patch by @kylemontgomery1 in https://github.com/rllm-org/rllm/pull/237
- v0.2 masking/parsing fix by @kylemontgomery1 in https://github.com/rllm-org/rllm/pull/238
- v0.2 rollout upgrade by @kylemontgomery1 in https://github.com/rllm-org/rllm/pull/241
- Feat: deepresearch integration by @yayashuxue in https://github.com/rllm-org/rllm/pull/215
- workflow updates by @kylemontgomery1 in https://github.com/rllm-org/rllm/pull/244
- added colab example of solver judge by @jeewoo-lee in https://github.com/rllm-org/rllm/pull/246
- v0.2 misc changes by @kylemontgomery1 in https://github.com/rllm-org/rllm/pull/245
- Add FireworksEngine for disaggregated rollout by @1stprinciple in https://github.com/rllm-org/rllm/pull/243
- AppWorld Integration for rLLM by @sunan135 in https://github.com/rllm-org/rllm/pull/235
- V0.2 by @jeffreysijuntan in https://github.com/rllm-org/rllm/pull/247
- update solver judge workflow by @kylemontgomery1 in https://github.com/rllm-org/rllm/pull/248
- update install instructions, update solver judge notebook by @kylemontgomery1 in https://github.com/rllm-org/rllm/pull/249
New Contributors
- @wj-Mcat made their first contribution in https://github.com/rllm-org/rllm/pull/142
- @annyan09023 made their first contribution in https://github.com/rllm-org/rllm/pull/144
- @tonyz0x0 made their first contribution in https://github.com/rllm-org/rllm/pull/154
- @mananroongta made their first contribution in https://github.com/rllm-org/rllm/pull/160
- @philippnormann made their first contribution in https://github.com/rllm-org/rllm/pull/168
- @VincentXWD made their first contribution in https://github.com/rllm-org/rllm/pull/174
- @abrohamLee made their first contribution in https://github.com/rllm-org/rllm/pull/172
- @yayashuxue made their first contribution in https://github.com/rllm-org/rllm/pull/180
- @kylemontgomery1 made their first contribution in https://github.com/rllm-org/rllm/pull/193
- @JasonWei05 made their first contribution in https://github.com/rllm-org/rllm/pull/205
- @James4Ever0 made their first contribution in https://github.com/rllm-org/rllm/pull/209
- @noiji made their first contribution in https://github.com/rllm-org/rllm/pull/216
- @jeewoo-lee made their first contribution in https://github.com/rllm-org/rllm/pull/221
- @1stprinciple made their first contribution in https://github.com/rllm-org/rllm/pull/223
- @NIL-zhuang made their first contribution in https://github.com/rllm-org/rllm/pull/229
- @erranlli made their first contribution in https://github.com/rllm-org/rllm/pull/233
- @listar2000 made their first contribution in https://github.com/rllm-org/rllm/pull/236
- @sunan135 made their first contribution in https://github.com/rllm-org/rllm/pull/235
Full Changelog: https://github.com/rllm-org/rllm/commits/v0.2.0