v0.6.0: LLaMa (Alpaca), Benchmark Util, T5 ILQL, Tests

The v0.6.0 release includes several new features, bug fixes, and overall improvements to the codebase. Here are the key changes:

šŸ“ Benchmarking and Improved Unit Tests

This release introduces a new benchmark util to make it easier to track regressions in our training pipeline, along with improved unit tests built with the hypothesis package:

* [feat] Add benchmark tools by @reciprocated in https://github.com/CarperAI/trlx/pull/357
* Add hypothesis tests for ILQL and fix edge cases by @cat-state in https://github.com/CarperAI/trlx/pull/370
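For context, hypothesis generates randomized inputs for property-based tests, which is how the new tests exercise edge cases. The snippet below is a minimal, generic sketch of a hypothesis test and is not taken from trlx's test suite; the round-trip property it checks is purely illustrative.

```python
# Minimal sketch of a hypothesis property-based test; the property checked here
# (prompt/continuation splits re-join losslessly) is illustrative only.
from hypothesis import given, strategies as st

@given(
    st.lists(st.integers(min_value=0, max_value=50256), min_size=1, max_size=64),
    st.data(),
)
def test_split_and_rejoin_roundtrip(token_ids, data):
    # hypothesis draws an arbitrary split point for each generated sequence,
    # including the empty-prompt and empty-continuation edge cases.
    split = data.draw(st.integers(min_value=0, max_value=len(token_ids)))
    prompt, continuation = token_ids[:split], token_ids[split:]
    assert prompt + continuation == token_ids
```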

šŸ¦™ LLaMa and Alpaca PPO/SFT Support

PPO support and examples for LLaMa are now available, and we’ve baked in an example for instruction fine-tuning models with the Alpaca dataset using our SFT trainer:

* [feat] Add LLaMa Model support for PPO by @PhungVanDuy in https://github.com/CarperAI/trlx/pull/375
* Add Alpaca by @cat-state in https://github.com/CarperAI/trlx/pull/400
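As a rough illustration of what the LLaMa support enables, here is a hedged sketch of a PPO run driven through `trlx.train`. The config path (`configs/ppo_config.yml`), checkpoint name, prompts, and reward function are placeholders chosen for the example, not values taken from the release.

```python
# Sketch of PPO fine-tuning a LLaMa checkpoint with trlx. The YAML path,
# checkpoint name, prompts, and reward function are placeholder assumptions.
import trlx
from trlx.data.configs import TRLConfig

config = TRLConfig.load_yaml("configs/ppo_config.yml")  # hypothetical config file
config.model.model_path = "path/to/llama-7b-hf"         # any local or hub LLaMa checkpoint
config.tokenizer.tokenizer_path = "path/to/llama-7b-hf"

def reward_fn(samples, **kwargs):
    # Toy reward: prefer shorter completions. Swap in a real reward model here.
    return [-float(len(sample)) for sample in samples]

trainer = trlx.train(
    reward_fn=reward_fn,
    prompts=["Explain PPO in one sentence."] * 64,
    eval_prompts=["Explain ILQL in one sentence."] * 8,
    config=config,
)
```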

5ļøāƒ£ T5 ILQL Support

T5 models can now be fine-tuned with ILQL:

* Support ILQL for T5 model, Fix PPO T5 for refactored code by @PhungVanDuy in https://github.com/CarperAI/trlx/pull/290
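To give a sense of how this is used, below is a hedged sketch of offline ILQL training on a T5 model via `trlx.train`; the config path, model name, and toy (prompt, completion, reward) data are assumptions made for illustration.

```python
# Sketch of ILQL fine-tuning a T5 model with trlx. The config path, model
# name, and toy dataset are placeholder assumptions.
import trlx
from trlx.data.configs import TRLConfig

config = TRLConfig.load_yaml("configs/ilql_config.yml")  # hypothetical config file
config.model.model_path = "google/flan-t5-large"
config.model.model_arch_type = "seq2seq"                 # mark the model as encoder-decoder
config.tokenizer.tokenizer_path = "google/flan-t5-large"

# ILQL trains offline from (prompt, completion) pairs with scalar rewards.
samples = [
    ["Summarize: the cat sat on the mat.", "A cat sat on a mat."],
    ["Summarize: the cat sat on the mat.", "Cats exist."],
]
rewards = [1.0, 0.2]

trainer = trlx.train(samples=samples, rewards=rewards, config=config)
```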

Fixes

What's Changed

New Contributors

Full Changelog: https://github.com/CarperAI/trlx/compare/v0.5.0...v0.6.0
