Name | Modified | Size | Downloads / Week |
---|---|---|---|
README.md | 2025-05-30 | 1.0 kB | |
v3.3.2 source code.tar.gz | 2025-05-30 | 3.2 MB | |
v3.3.2 source code.zip | 2025-05-30 | 3.8 MB | |
Totals: 3 Items | | 7.0 MB | 1 |
Gaudi improvements.
What's Changed
- Upgrade to new vllm extension ops (fix issue in exponential bucketing) by @sywangyi in https://github.com/huggingface/text-generation-inference/pull/3239
- Nix: switch to hf-nix by @danieldk in https://github.com/huggingface/text-generation-inference/pull/3240
- Add Qwen3 by @yuanwu2017 in https://github.com/huggingface/text-generation-inference/pull/3229
- fp8 compressed_tensors w8a8 support by @sywangyi in https://github.com/huggingface/text-generation-inference/pull/3242
- [Gaudi] Fix the OOM issue of Llama-4-Scout-17B-16E-Instruct by @yuanwu2017 in https://github.com/huggingface/text-generation-inference/pull/3245
- Fix the Llama-4-Maverick-17B-128E crash issue by @yuanwu2017 in https://github.com/huggingface/text-generation-inference/pull/3246
- Prepare for 3.3.2 by @danieldk in https://github.com/huggingface/text-generation-inference/pull/3249
Full Changelog: https://github.com/huggingface/text-generation-inference/compare/v3.3.1...v3.3.2