Releases: volcengine/verl
v0.2.0.post2
What's Changed
- Fixed installation issues.
- Fixed the remove padding flags in the gemma example.
New Contributors
Full Changelog: v0.2...v0.2.0.post2
v0.2 release
Highlights
New algorithms and features
- GRPO
- ReMax
- REINFORCE++
- Checkpoint manager for FSDP backend
- Sandbox for reward verification and scoring in PRIME
Performance optimization:
- Remove padding tokens (i.e., sequence packing). A significant throughput increase is expected for Llama, Mistral, Gemma, and Qwen2 transformer models. See the documentation.
actor_rollout_ref.model.use_remove_padding=True
critic.model.use_remove_padding=True
- Dynamic batch size. A significant throughput increase is expected for variable-length sequences. See the documentation and example.
actor_rollout_ref.actor.ppo_max_token_len_per_gpu
actor_rollout_ref.rollout.log_prob_max_token_len_per_gpu
actor_rollout_ref.ref.log_prob_max_token_len_per_gpu
critic.ppo_max_token_len_per_gpu
critic.forward_micro_batch_size_per_gpu
reward_model.forward_micro_batch_size_per_gpu
- Sequence parallelism for long-context training. See the documentation and example.
actor_rollout_ref.actor.ulysses_sequence_parallel_size
critic.ulysses_sequence_parallel_size
reward_model.ulysses_sequence_parallel_size
- vLLM v0.7+ integration (preview). For the Qwen2 PPO example, rollout time is reduced by 25% compared to v0.6.3, and by 45% when CUDA graph is enabled. See the documentation.
actor_rollout_ref.rollout.enforce_eager=False
actor_rollout_ref.rollout.free_cache_engine=False
- Liger-kernel integration for SFT. See the documentation.
model.use_liger=True
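To make the two throughput options above more concrete, here is a minimal Python sketch of the underlying ideas: stripping padding tokens so a batch becomes one packed sequence, and grouping sequences into micro-batches by a token budget rather than a fixed sample count. This is an illustration only and is not verl's implementation; the function names `remove_padding` and `batch_by_token_budget` are hypothetical, and the greedy grouping is an assumed simplification of whatever balancing strategy the library actually uses.

```python
from typing import List, Tuple


def remove_padding(
    input_ids: List[List[int]], attention_mask: List[List[int]]
) -> Tuple[List[int], List[int]]:
    """Flatten a padded batch into one packed sequence (hypothetical helper).

    Returns the packed token ids and the cumulative sequence lengths, which
    mark where each original sequence starts and ends in the packed list.
    """
    packed: List[int] = []
    cu_seqlens = [0]
    for ids, mask in zip(input_ids, attention_mask):
        # Keep only positions the attention mask marks as real tokens.
        kept = [tok for tok, m in zip(ids, mask) if m == 1]
        packed.extend(kept)
        cu_seqlens.append(cu_seqlens[-1] + len(kept))
    return packed, cu_seqlens


def batch_by_token_budget(seq_lens: List[int], max_tokens: int) -> List[List[int]]:
    """Greedily group sequence indices so each micro-batch fits max_tokens.

    Assumed simplification: sequences are taken in their given order.
    """
    batches: List[List[int]] = []
    current: List[int] = []
    current_tokens = 0
    for idx, n in enumerate(seq_lens):
        if n > max_tokens:
            raise ValueError(
                f"sequence {idx} has {n} tokens, over the budget of {max_tokens}"
            )
        # Start a new micro-batch when adding this sequence would overflow.
        if current and current_tokens + n > max_tokens:
            batches.append(current)
            current, current_tokens = [], 0
        current.append(idx)
        current_tokens += n
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    ids = [[5, 6, 7, 0, 0], [8, 9, 0, 0, 0], [1, 2, 3, 4, 5]]
    mask = [[1, 1, 1, 0, 0], [1, 1, 0, 0, 0], [1, 1, 1, 1, 1]]
    packed, cu_seqlens = remove_padding(ids, mask)
    print(packed, cu_seqlens)
    lengths = [b - a for a, b in zip(cu_seqlens, cu_seqlens[1:])]
    print(batch_by_token_budget(lengths, max_tokens=6))
```

In this toy batch, 15 padded positions shrink to 10 real tokens, which is where the throughput gain from removing padding comes from. The token budget plays the role that options such as `ppo_max_token_len_per_gpu` suggest: short sequences share a micro-batch while long ones get their own.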
Changelog
New Features
- Algorithm Support:
- Performance Improvements:
  - Enabled dynamic batch size support (#118).
  - Added meta device initialization and parallel load for FSDP to avoid OOMs during init (#123).
  - Improved gradient accumulation in sequence balance (#141).
  - Added ref/RM offload support (#121).
  - Added LoRA support for SFT (#127).
  - Added support for rmpad/data packing in FSDP with transformers (#91).
  - Added Liger kernel integration (#133).
- Experiment Tracking:
Bug Fixes
- Critical Fixes:
- Code Fixes:
Improvements
- Performance:
- Miscellaneous:
  - Added option to log validation generations to wandb (#177).
Deprecations and Breaking Changes
- Breaking Changes:
Contributors
A big thank you to all the contributors who made this release possible:
@zhanluxianshen @xingyaoww @fzyzcjy @emergenz @openhands-agent @ZSL98 @YSLIU627 @ZefanW @corbt @jaysonfrancis @hiyouga @Jiayi-Pan @hongpeng-guo @eltociear @chujiezheng @PanAndy @zwhe99 @pcmoritz @huiyeruzhou @VPeterV @uygnef @zhiqi-0 @ExtremeViscent @liziniu @nch0w @Cppowboy @TonyLianLong @4332001876 @tyler-romero @ShaohonChen @kinman0224 @willem-bd @bebetterest @WeiXiongUST @dignfei
The PyPI package will be available soon! Please let us know on GitHub if you run into problems extending the RL training recipes based on the pip-installed version of verl.
Full Changelog: v0.1...v0.2
v0.1
What's Changed
- [misc] feat: update tutorial for opensource version by @PeterSH6 in #4
- [misc] fix: vllm gpu executor issue when world_size is 1 and typo in doc by @PeterSH6 in #9
- [ci] feat: add test files for ray hybrid programming model by @PeterSH6 in #23
- [chore] remove unnecessary updating of _worker_names by @kevin85421 in #19
- [misc] feat: add gemma example for small scale debug and fix gradient checkpoint in critic by @PeterSH6 in #27
- [misc] fix issue in hf_weight_loader and fix typo in doc by @PeterSH6 in #30
- [ci] test lint ci and lint tests dir by @PeterSH6 in #28
- [example] fix: fix math circular dependency by @eric-haibin-lin in #31
- [example] fix: make wandb optional dependency. allow extra args in existing scripts by @eric-haibin-lin in #32
- [docs] feat: add related publications by @eric-haibin-lin in #35
- [tokenizer] feat: support tokenizers whose pad_token_id is none by @eric-haibin-lin in #36
- [rollout] feat: support vLLM v0.6.3 and fix hf rollout import issue by @PeterSH6 in #33
- [distro] feat: add docker support by @eric-haibin-lin in #41
- [example] add a split placement tutorial by @PeterSH6 in #43
- [doc] add a new quickstart section by @PeterSH6 in #44
- [BREAKING][core] move single_controller into verl directory by @PeterSH6 in #45
New Contributors
- @eric-haibin-lin made their first contribution in #31
Full Changelog: v0.1rc...v0.1
v0.1rc
What's Changed
- [init] feat: first commit for open source
- [doc] feat: fix typo and delete deprecated config element by @PeterSH6 in #2
- [misc] fix: resolve pypi missing directory by @PeterSH6 in #3
Credit To
@PeterSH6 @vermouth1992 @zw0610 @wuxibin89 @YipZLF @namizzz @pengyanghua @eric-haibin-lin @Meteorix and others in Seed Foundation MLSys Team