| Name | Modified | Size |
|---|---|---|
| README.md | 2026-03-25 | 3.4 kB |
| v0.8.0 source code.tar.gz | 2026-03-25 | 1.1 MB |
| v0.8.0 source code.zip | 2026-03-25 | 1.6 MB |
| Totals: 3 items | | 2.7 MB |
What's changed
- Breaking: Trainer, evaluator, generator, validator, and task moved from `fairseq2.recipe` to the `fairseq2` package root. (#1417)
- Breaking: LM recipes restructured: `text_generate` renamed to `generate`, SFT configs removed/renamed, recipe config classes changed. (#1431, #1432, #1433)
- Breaking: `RecipeModel` is deprecated. Access the model directly via `.module` instead. (#1403)
- Breaking: `pq.ParquetDataset` replaced with the `pyarrow.dataset` interface. (#1490)
- Breaking: `resolve_optional` renamed to `maybe_resolve`. (#1462)
- Breaking: Revised `ModelCheckpointLoader` API. (#1475)
- Breaking: Refactored tensor-sharded modules (embedding, projection, FFN, attention). (#1476)
- fsspec integration for remote filesystem support. Checkpoints can be saved to and loaded from S3 via `--checkpoint-dir s3://bucket/path/`. Requires `s3fs`. (#1126)
- New `GlobalFileSystem` replaces `LocalFileSystem` as the default, dispatching to the appropriate backend based on the URI scheme. (#1126)
- PyTorch 2.9.1 and 2.10 (forward compatibility) are now supported. PyTorch 2.9 introduced breaking changes to LR scheduler return types, which have been addressed. (#1477, #1491, #1456)
- New context managers for procedural programming: `GangContext`, `DeviceContext`, `DataTypeContext`, `current_dtype`. Eliminates the need to pass state through nested function calls. (#1474, #1473, #1464)
- `CheckpointManager`, `Optimizer`, and `LRScheduler` are now exposed in `RecipeContext`. (#1461)
- Synchronous asset loading across ranks for models and tokenizers. Use when all ranks need identical assets loaded simultaneously. (#1429, #1426)
- `CheckpointManager.register_save_hook` allows custom logic during checkpoint saves. (#1439)
- Config files now support `${env:<NAME>}` to interpolate environment variables. (#1435)
- `--no-rich` CLI flag disables rich text output for log parsing. (#1421)
- Hugging Face export now runs in an isolated process, with the command line and logs saved for debugging. (#1459, #1458, #1437, #1434)
- Improved support for gated Hugging Face models. (#1422)
- `get_family` utility functions for detecting model families. (#1454)
- Gemma3n model family (E2B/E4B) with text + audio inference and SFT training. (#1496)
- Generic Hugging Face model integration: load, shard, and train any Hugging Face CausalLM model directly through `HgCausalLMAdapter` without requiring a native fairseq2 reimplementation. Includes FSDP sharding, HF tokenizer integration, and SFT recipe support. (#1479)
- `AssetDownloadManager` gains a `local_only` parameter and custom download subpath support. (#1423, #1425)
- Recipes now set Python `random` and `numpy` seeds for reproducibility. (#1419)
- Wandb metric recorder now respects wandb environment variables. (#1440)
- Improved `share_parameters` implementation. (#1484)
- Fixed `cross_entropy` with `reduction="mean"` to properly exclude padding tokens from the denominator. (#1455)
- Fixed `Flash3SDPA` to support the `flash-attn-3` v3.0.0 package API (`flash_attn_3._C` / `torch.ops.flash_attn_3`) in addition to the legacy `flash_attn_3_cuda` module. (#1495)
- Fixed data pipeline sampling bug when `allow_repeats=False` with many pipelines. (#1471)
- Fixed `DataParallelFacade` weakref errors. (#1447, #1436)
- Fixed WER calculation to use lists instead of tensors. (#1413)
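The `GlobalFileSystem` change (#1126) routes each path to a backend by its URI scheme. A minimal sketch of that dispatch idea, using only the standard library (illustrative only; the actual class delegates to fsspec backends and its internals are not shown here):

```python
from urllib.parse import urlparse

def pick_backend(path: str) -> str:
    """Return a backend name for a path based on its URI scheme.

    Paths without a scheme (plain POSIX paths) fall back to the
    local filesystem, mirroring the pre-#1126 default behavior.
    """
    scheme = urlparse(path).scheme
    return scheme if scheme else "local"

print(pick_backend("s3://bucket/checkpoints/step_1000"))  # s3
print(pick_backend("/home/user/checkpoints"))             # local
```

With this kind of dispatch in place, a flag like `--checkpoint-dir s3://bucket/path/` needs no special-casing in the recipe code: the URI itself selects the remote backend.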
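The `${env:<NAME>}` config interpolation (#1435) can be illustrated with a self-contained sketch. This is not the fairseq2 implementation; the regex, function name, and error handling here are assumptions for demonstration:

```python
import os
import re

# Matches ${env:NAME}, where NAME is a valid environment variable name.
_ENV_PATTERN = re.compile(r"\$\{env:([A-Za-z_][A-Za-z0-9_]*)\}")

def interpolate_env(text: str) -> str:
    """Replace every ${env:NAME} placeholder with the value of os.environ[NAME]."""
    def _substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in os.environ:
            raise KeyError(f"environment variable '{name}' is not set")
        return os.environ[name]

    return _ENV_PATTERN.sub(_substitute, text)

os.environ["DATA_DIR"] = "/mnt/datasets"
print(interpolate_env("train_path: ${env:DATA_DIR}/train.jsonl"))
# train_path: /mnt/datasets/train.jsonl
```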
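The `cross_entropy` fix (#1455) is about which count goes in the denominator when averaging token losses: padding positions must not inflate it. A pure-Python sketch of the intended arithmetic (illustrative only; the sentinel value and helper name are assumptions, not fairseq2 code):

```python
PAD = -100  # sentinel label marking padding positions (assumption for this sketch)

def mean_loss(losses: list[float], targets: list[int]) -> float:
    """Average per-token losses, excluding padding tokens from the denominator."""
    kept = [loss for loss, target in zip(losses, targets) if target != PAD]
    if not kept:
        return 0.0
    # Divide by the number of real tokens, not len(losses).
    return sum(kept) / len(kept)

# Two real tokens and two pad positions: the mean is over 2 tokens, not 4.
print(mean_loss([1.0, 3.0, 0.0, 0.0], [5, 7, PAD, PAD]))  # 2.0
```

Dividing by the full sequence length instead would yield 1.0 here, silently shrinking the reported loss on heavily padded batches.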
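The WER fix (#1413) moved the calculation onto plain word lists. A standard word-level edit-distance WER over lists looks like this (a textbook single-row Levenshtein implementation, not fairseq2's actual code):

```python
def wer(reference: list[str], hypothesis: list[str]) -> float:
    """Word error rate: Levenshtein distance between word lists, divided by
    the reference length. Uses a single rolling row of the DP table."""
    dp = list(range(len(hypothesis) + 1))  # distances against an empty reference
    for i, ref_word in enumerate(reference, start=1):
        prev_diag = dp[0]  # dp[i-1][j-1] for the upcoming column
        dp[0] = i          # distance from i reference words to empty hypothesis
        for j, hyp_word in enumerate(hypothesis, start=1):
            cur = dp[j]
            cost = 0 if ref_word == hyp_word else 1
            dp[j] = min(dp[j] + 1,          # deletion
                        dp[j - 1] + 1,      # insertion
                        prev_diag + cost)   # substitution (or match)
            prev_diag = cur
    return dp[-1] / len(reference)

# One substitution out of three reference words.
print(wer("the cat sat".split(), "the cat sit".split()))  # 0.3333...
```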