Name | Modified | Size | Downloads / Week |
---|---|---|---|
Patch release v4.55.3 source code.tar.gz | 2025-08-21 | 18.9 MB | |
Patch release v4.55.3 source code.zip | 2025-08-21 | 24.0 MB | |
README.md | 2025-08-21 | 741 Bytes | |
Totals: 3 Items | | 43.0 MB | 2 |
Patch release 4.55.3
This patch release focuses on stabilizing FlashAttention-2 on Ascend NPU, improving FSDP behavior for generic-task models, and fixing the MXFP4 integration for GPT-OSS.
Bug Fixes & Improvements
- FlashAttention-2 / Ascend NPU – Fix “unavailable” runtime error (#40151) by @FightingZhen
- FlashAttention kwargs – Revert FA kwargs preparation to resolve regression (#40161) by @Cyrilvallez
- FSDP (generic-task models) – Fix sharding/runtime issues (#40191) by @Cyrilvallez
- GPT-OSS / MXFP4 – Ensure swiglu_limit is correctly passed through (#40197) by @returnL
- Mamba – Fix cache handling to prevent stale/incorrect state (#40203) by @manueldeprada
- Misc – Minor follow-up fix addressing #40262 by @ArthurZucker