
v0.7 release

Blog post: verl 0.7 release blog

Highlights

Model Engine

  • Integrate Megatron-Bridge and support LoRA/PEFT, see blog post: How We Build Trillion Parameter Reasoning RL with 10% GPUs
  • Support experimental fp8 training for megatron backend
  • Support new models for the megatron backend: GPT-OSS, Qwen3-Next
  • Comprehensive support for the new model engine; the FSDP and Megatron engines are production-ready.
  • Dispatch tensordict with nested tensor instead of padded DataProto
  • Add TrainingWorker, which exposes a Tinker-like API
  • Add VLM support for model engine, SFT and RL trainer
  • Add model engine based critic model
  • Implement ActorRolloutRefWorker on top of TrainingWorker, supporting different backends in one worker
  • Add new VeOmni engine (still in alpha).
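
The padded-to-nested change above trades DataProto-style padding for nested tensors that store only real tokens. A toy illustration in plain Python (not the verl API, and with hypothetical sequence lengths) of why that saves memory with variable-length batches:

```python
# Hypothetical sequence lengths (prompt + response tokens) in one RL batch.
seq_lens = [512, 37, 128, 9]

# Padded batching stores every sequence at the longest length...
padded_tokens = max(seq_lens) * len(seq_lens)
# ...while a nested layout stores only the tokens that actually exist.
nested_tokens = sum(seq_lens)

savings = 1 - nested_tokens / padded_tokens
print(f"padded={padded_tokens} nested={nested_tokens} savings={savings:.0%}")
```

The more skewed the length distribution, the larger the savings, which is why this matters for RL rollouts with a few very long responses.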

Rollout Engine

  • Remove SPMD rollout mode
  • Support blockwise fp8 rollout for vllm and sglang; support online quantization for vllm with torchao
  • Experimental router replay support for vllm
  • Optimize multi-modal data fetch and preprocess, support video input
  • Upgrade to vllm==0.12.0; sglang==0.5.6
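
Blockwise fp8 keeps one scale per block of weights rather than one per tensor, so a single outlier only degrades its own block. A minimal numpy sketch of the idea, with integer rounding standing in for the real fp8 (e4m3) cast and a hypothetical block size; this is not the actual vllm/sglang kernel:

```python
import numpy as np

FP8_E4M3_MAX = 448.0  # largest finite magnitude representable in fp8 e4m3

def blockwise_fp8_quant(x: np.ndarray, block_size: int = 128):
    """Quantize a flat weight vector with one scale per block (sketch)."""
    blocks = x.reshape(-1, block_size)
    scales = np.abs(blocks).max(axis=1, keepdims=True) / FP8_E4M3_MAX
    scales = np.where(scales == 0, 1.0, scales)  # avoid div-by-zero on all-zero blocks
    q = np.clip(np.round(blocks / scales), -FP8_E4M3_MAX, FP8_E4M3_MAX)
    return q, scales

def blockwise_fp8_dequant(q: np.ndarray, scales: np.ndarray) -> np.ndarray:
    return (q * scales).reshape(-1)

w = np.linspace(-1.0, 1.0, 256)
q, s = blockwise_fp8_quant(w)
err = np.abs(blockwise_fp8_dequant(q, s) - w).max()
```

Per-block scaling bounds the rounding error by half a scale step within each block.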

Reward

  • Support hybrid reward scenarios, including generative, discriminative, rule-based rewards, and their combinations.
  • Refactor reward models into server mode, supporting both colocated and standalone deployments.
  • Introduce new reward managers for more complex scenarios: a limited mode for request-rate control and a remote mode for CPU-intensive tasks.
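
A hybrid reward mixes rule-based checks with model-based scores. A self-contained sketch with hypothetical function names and weights (not verl's reward-manager API); the model-based term is stubbed so the example runs offline:

```python
def rule_based_reward(response: str) -> float:
    # Illustrative format check: reward responses that contain a boxed answer.
    return 1.0 if "\\boxed{" in response else 0.0

def model_based_reward(response: str) -> float:
    # Stand-in for a generative or discriminative reward-model call;
    # here just a length-based stub for illustration.
    return min(len(response) / 100.0, 1.0)

def hybrid_reward(response: str, w_rule: float = 0.5, w_model: float = 0.5) -> float:
    # Weighted combination of the two reward sources.
    return w_rule * rule_based_reward(response) + w_model * model_based_reward(response)

score = hybrid_reward("The answer is \\boxed{42}")
```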

Algorithm

  • Add CISPO: Clipped IS-weight Policy Optimization
  • Add SAPO: Soft Adaptive Policy Optimization
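
Per the CISPO paper (from MiniMax-M1), the importance-sampling ratio is clipped and treated as a constant (stop-gradient), with the gradient flowing only through the log-probability term. A numpy sketch of the forward loss value with illustrative clipping bounds; in a real autograd framework the clipped ratio would be detached:

```python
import numpy as np

def cispo_loss(logp, old_logp, adv, eps_low=0.2, eps_high=0.2):
    """Forward value of a CISPO-style per-token objective (sketch)."""
    ratio = np.exp(logp - old_logp)  # importance-sampling weight
    clipped = np.clip(ratio, 1.0 - eps_low, 1.0 + eps_high)
    # In PyTorch this would be clipped.detach(): no gradient flows through the ratio.
    return float(-(clipped * adv * logp).mean())

logp = np.log(np.array([0.5, 0.5]))
loss = cispo_loss(logp, logp, np.array([1.0, -1.0]))
```

Unlike PPO-style clipping, tokens with large ratios still contribute gradient (scaled by the clipped weight) rather than being dropped entirely.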

Recipe

  • [NEW] VLA: add experimental support for VLA models
  • [NEW] rhymerl: History Rhymes: Accelerating LLM Reinforcement Learning with RhymeRL
  • TransferQueue: support multiple data partition and optimize tensor zero-copy serialization
  • One-step-off-policy/Fully async: optimize weight synchronization via the checkpoint engine, with bucketing and pipelining support.
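
Bucketed weight synchronization groups parameters into fixed-size buckets so transfers can be pipelined instead of shipping one monolithic blob. A hypothetical greedy bucketing sketch (names and sizes are illustrative, not verl's checkpoint engine):

```python
def make_buckets(param_sizes, bucket_bytes):
    """Greedily pack (name, size) parameters into buckets of at most bucket_bytes.

    A single parameter larger than bucket_bytes still gets its own bucket.
    """
    buckets, current, used = [], [], 0
    for name, size in param_sizes:
        if current and used + size > bucket_bytes:
            buckets.append(current)
            current, used = [], 0
        current.append(name)
        used += size
    if current:
        buckets.append(current)
    return buckets

params = [("embed", 3), ("ln", 1), ("attn", 5), ("mlp", 4)]
buckets = make_buckets(params, bucket_bytes=6)
```

Each bucket can then be sent while the next one is being packed, overlapping serialization with communication.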

What's Changed

New Contributors

Full Changelog: https://github.com/volcengine/verl/compare/v0.6.1...v0.7.0

Source: README.md, updated 2026-01-05