rLLM v0.2.1: Tinker backend, VLM training, Eval Protocol, and SDK (preview)
We are excited to release rLLM v0.2.1. This version ships the following new features:
- **rLLM SDK (preview)**: The rLLM SDK lets you turn agents written in frameworks such as LangGraph, SmolAgent, or Strands into trainable workflows. Check out the LangGraph RAG example, which builds a RAG agent and trains it with the rLLM SDK.
- **Tinker training backend**: In addition to `verl`, rLLM now supports `Tinker` as a training backend. You can use the same abstractions for building agents and easily switch between backends for training (see the sketch after this list).
- **VLM training**: rLLM supports Vision-Language Model (VLM) training with the `verl` backend. See the Geo3K training example for reference.
- **LoRA fine-tuning**: rLLM supports LoRA training with both the `verl` and `Tinker` backends. See the GSM8K LoRA example for how to enable LoRA training with a single config change.
- **Eval Protocol integration**: We integrate with the Eval Protocol from Fireworks AI, so you can now train on any environment the Eval Protocol supports. See the example that uses the Eval Protocol in rLLM to train a FrozenLake agent.
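To make the backend-switching and LoRA bullets above concrete, here is a minimal, self-contained sketch of the pattern they describe: the agent/workflow code stays the same, and a single config value selects the training backend. Every name in this sketch (`TrainerConfig`, `VerlTrainer`, `TinkerTrainer`, the `lora` flag, the model id) is an illustrative assumption, not rLLM's actual API; see the linked examples in the repository for real usage.

```python
from dataclasses import dataclass


@dataclass
class TrainerConfig:
    backend: str = "verl"                      # switch to "tinker" with a one-line change
    model: str = "Qwen/Qwen2.5-7B-Instruct"    # placeholder model id
    lora: bool = False                         # hypothetical flag mirroring the LoRA examples


class VerlTrainer:
    def __init__(self, cfg: TrainerConfig):
        self.cfg = cfg

    def fit(self, workflow: str) -> None:
        print(f"[verl] training {workflow} on {self.cfg.model} (lora={self.cfg.lora})")


class TinkerTrainer:
    def __init__(self, cfg: TrainerConfig):
        self.cfg = cfg

    def fit(self, workflow: str) -> None:
        print(f"[tinker] training {workflow} on {self.cfg.model} (lora={self.cfg.lora})")


# Config-driven backend selection: the workflow code never changes,
# only the `backend` key in the config does.
BACKENDS = {"verl": VerlTrainer, "tinker": TinkerTrainer}


def build_trainer(cfg: TrainerConfig):
    return BACKENDS[cfg.backend](cfg)


if __name__ == "__main__":
    cfg = TrainerConfig(backend="tinker", lora=True)
    build_trainer(cfg).fit("gsm8k-math-agent")
```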
A big shoutout to @thwu1, @kylemontgomery1, @listar2000, and @xzrderek for their outstanding work on these features.
What's Changed
- make rllm-specific configs applied correctly and robustly by @listar2000 in https://github.com/rllm-org/rllm/pull/256
- Ensure disable_thinking defaults to False when config is None by @Tendo33 in https://github.com/rllm-org/rllm/pull/258
- fix: circular import issues in WORKFLOW_CLASS_MAPPING by @listar2000 in https://github.com/rllm-org/rllm/pull/261
- [nightly] initialize the nightly branch by @listar2000 in https://github.com/rllm-org/rllm/pull/263
- Fix environment variable forwarding to ray runtime env by @listar2000 in https://github.com/rllm-org/rllm/pull/265
- [nightly] update recent changes on workflow engines by @listar2000 in https://github.com/rllm-org/rllm/pull/268
- Fix : Prevent KeyError in _pad_dataproto_to_world_size by @mananroongta in https://github.com/rllm-org/rllm/pull/274
- Fix retokenization by @thwu1 in https://github.com/rllm-org/rllm/pull/272
- fix controlling the n_parallel_agents and the concurrent env operations by @LianShuQuan in https://github.com/rllm-org/rllm/pull/271
- Added is_correct & reward flow through tool env by @mananroongta in https://github.com/rllm-org/rllm/pull/277
- Integrate Eval Protocol as RL environment by @1stprinciple in https://github.com/rllm-org/rllm/pull/276
- SWEEnv.from_dict() by @LianShuQuan in https://github.com/rllm-org/rllm/pull/278
- Fix: Resolve PyArrow nested data conversion error in distributed dataset loading by @erranlli in https://github.com/rllm-org/rllm/pull/281
- Per Episode Logging Feature by @qywu in https://github.com/rllm-org/rllm/pull/282
- [feature] Support Tinker as a backend by @thwu1 in https://github.com/rllm-org/rllm/pull/283
- [feat] Tinker Workflow Trainer by @thwu1 in https://github.com/rllm-org/rllm/pull/288
- Fix fireworks dependency by @listar2000 in https://github.com/rllm-org/rllm/pull/296
- Examples: fix utils import by @Flecart in https://github.com/rllm-org/rllm/pull/295
- [Refactor] Update Tinker Backend Example by @thwu1 in https://github.com/rllm-org/rllm/pull/300
- Revert "fix: Gracefully skip overlong prompts during training to prev… by @1stprinciple in https://github.com/rllm-org/rllm/pull/302
- Fixes [#303] Optimize old_log_prob computation in PPO trainer by @BabelTower in https://github.com/rllm-org/rllm/pull/304
- Bug/n parallel agents by @kylemontgomery1 in https://github.com/rllm-org/rllm/pull/307
- [nightly] merge recent updates in main back to nightly by @listar2000 in https://github.com/rllm-org/rllm/pull/308
- Adding generic Eval Protocol environments to rLLM by @xzrderek in https://github.com/rllm-org/rllm/pull/306
- [feat] sdk by @thwu1 in https://github.com/rllm-org/rllm/pull/310
- Multimodal by @kylemontgomery1 in https://github.com/rllm-org/rllm/pull/315
- add rllm docs by @xzrderek in https://github.com/rllm-org/rllm/pull/312
- Fix import problem of megatron ray worker group by @listar2000 in https://github.com/rllm-org/rllm/pull/319
- Fix color print display issue by @listar2000 in https://github.com/rllm-org/rllm/pull/317
- [feat] Intergrate OpenTelemetry by @thwu1 in https://github.com/rllm-org/rllm/pull/320
- Remove unnecessary free_cache_engine checks. by @listar2000 in https://github.com/rllm-org/rllm/pull/324
- add vlm docs by @kylemontgomery1 in https://github.com/rllm-org/rllm/pull/326
- [feat] Importance Sampling by @thwu1 in https://github.com/rllm-org/rllm/pull/332
- Fix repetitive application id causing vLLM issue by @listar2000 in https://github.com/rllm-org/rllm/pull/334
- [feat] Add Langgraph Training Example, Fix bugs, Refactor Sdk by @thwu1 in https://github.com/rllm-org/rllm/pull/335
- Add Sdk Doc by @thwu1 in https://github.com/rllm-org/rllm/pull/339
- [feature] simplified deps by @kylemontgomery1 in https://github.com/rllm-org/rllm/pull/327
- Add gsm8k-lora script by @listar2000 in https://github.com/rllm-org/rllm/pull/342
- [v0.2.1] Merge nightly into main for rLLM v0.2.1 by @jeffreysijuntan in https://github.com/rllm-org/rllm/pull/341
New Contributors
- @Tendo33 made their first contribution in https://github.com/rllm-org/rllm/pull/258
- @thwu1 made their first contribution in https://github.com/rllm-org/rllm/pull/272
- @LianShuQuan made their first contribution in https://github.com/rllm-org/rllm/pull/271
- @qywu made their first contribution in https://github.com/rllm-org/rllm/pull/282
- @Flecart made their first contribution in https://github.com/rllm-org/rllm/pull/295
- @BabelTower made their first contribution in https://github.com/rllm-org/rllm/pull/304
- @xzrderek made their first contribution in https://github.com/rllm-org/rllm/pull/306
Full Changelog: https://github.com/rllm-org/rllm/compare/v0.2.0...v0.2.1