v0.6.0: LLaMa (Alpaca), Benchmark Util, T5 ILQL, Tests

The v0.6.0 release includes several new features, bug fixes, and overall improvements to the codebase. Here are the key changes:

šŸ“ Benchmarking and Improved Unit Tests

This release introduces a new benchmark util to make it easier to track regressions in our training pipeline, along with improved unit tests built with the hypothesis package:

* [feat] Add benchmark tools by @reciprocated in https://github.com/CarperAI/trlx/pull/357
* Add hypothesis tests for ILQL and fix edge cases by @cat-state in https://github.com/CarperAI/trlx/pull/370
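For context, hypothesis generates randomized inputs for property-based tests, which is how the new tests exercise edge cases. The snippet below is a minimal, generic sketch of a hypothesis test and is not taken from trlx's test suite; the round-trip property it checks is purely illustrative.

```python
# Minimal sketch of a hypothesis property-based test; the property checked here
# (prompt/continuation splits re-join losslessly) is illustrative only.
from hypothesis import given, strategies as st

@given(
    st.lists(st.integers(min_value=0, max_value=50256), min_size=1, max_size=64),
    st.data(),
)
def test_split_and_rejoin_roundtrip(token_ids, data):
    # hypothesis draws an arbitrary split point for each generated sequence,
    # including the empty-prompt and empty-continuation edge cases.
    split = data.draw(st.integers(min_value=0, max_value=len(token_ids)))
    prompt, continuation = token_ids[:split], token_ids[split:]
    assert prompt + continuation == token_ids
```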

šŸ¦™ LLaMa and Alpaca PPO/SFT Support

PPO support and examples for LLaMa are now available, and we’ve baked in an example for instruction fine-tuning models with the Alpaca dataset using our SFT trainer:

* [feat] Add LLaMa Model support for PPO by @PhungVanDuy in https://github.com/CarperAI/trlx/pull/375
* Add Alpaca by @cat-state in https://github.com/CarperAI/trlx/pull/400
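As a rough illustration of what the LLaMa support enables, here is a hedged sketch of a PPO run driven through `trlx.train`. The config path (`configs/ppo_config.yml`), checkpoint name, prompts, and reward function are placeholders chosen for the example, not values taken from the release.

```python
# Sketch of PPO fine-tuning a LLaMa checkpoint with trlx. The YAML path,
# checkpoint name, prompts, and reward function are placeholder assumptions.
import trlx
from trlx.data.configs import TRLConfig

config = TRLConfig.load_yaml("configs/ppo_config.yml")  # hypothetical config file
config.model.model_path = "path/to/llama-7b-hf"         # any local or hub LLaMa checkpoint
config.tokenizer.tokenizer_path = "path/to/llama-7b-hf"

def reward_fn(samples, **kwargs):
    # Toy reward: prefer shorter completions. Swap in a real reward model here.
    return [-float(len(sample)) for sample in samples]

trainer = trlx.train(
    reward_fn=reward_fn,
    prompts=["Explain PPO in one sentence."] * 64,
    eval_prompts=["Explain ILQL in one sentence."] * 8,
    config=config,
)
```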

5ļøāƒ£ T5 ILQL Support

T5 models can now be fine-tuned with ILQL:

* Support ILQL for T5 model, Fix PPO T5 for refactored code by @PhungVanDuy in https://github.com/CarperAI/trlx/pull/290
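To give a sense of how this is used, below is a hedged sketch of offline ILQL training on a T5 model via `trlx.train`; the config path, model name, and toy (prompt, completion, reward) data are assumptions made for illustration.

```python
# Sketch of ILQL fine-tuning a T5 model with trlx. The config path, model
# name, and toy dataset are placeholder assumptions.
import trlx
from trlx.data.configs import TRLConfig

config = TRLConfig.load_yaml("configs/ilql_config.yml")  # hypothetical config file
config.model.model_path = "google/flan-t5-large"
config.model.model_arch_type = "seq2seq"                 # mark the model as encoder-decoder
config.tokenizer.tokenizer_path = "google/flan-t5-large"

# ILQL trains offline from (prompt, completion) pairs with scalar rewards.
samples = [
    ["Summarize: the cat sat on the mat.", "A cat sat on a mat."],
    ["Summarize: the cat sat on the mat.", "Cats exist."],
]
rewards = [1.0, 0.2]

trainer = trlx.train(samples=samples, rewards=rewards, config=config)
```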

Fixes

What's Changed

New Contributors

Full Changelog: https://github.com/CarperAI/trlx/compare/v0.5.0...v0.6.0
