Name | Modified | Size | Downloads / Week |
---|---|---|---|
README.md | 2025-08-29 | 1.4 kB | |
v3.9.1 source code.tar.gz | 2025-08-29 | 13.3 MB | |
v3.9.1 source code.zip | 2025-08-29 | 16.6 MB | |
Totals: 3 Items | | 29.9 MB | 1 |
This is a patch release containing the following changes to v3.9:
* Reduced sizes in Graph API SDPA examples (257d689ade952ba61fa926e8ba8127685133ccd2)
* Fixed correctness issue in `bf16` depthwise convolution with `bf16` bias on AArch64 CPUs (218b41ddb3e9e63cc6f317c02cd79a4a1e4b06a0)
* Changed Intel GPU data alignment check from error to warning (5c5008a8cca72cc89650e9530cf2838d28f26277)
* Improved `bf16` matmul performance on processors with Intel AMX instruction set support (54b63549e97599a58a3ae6ab3e9a381f4ff03c46, 30c4d8d9d967bbcf54ba753305ca392282c692f3)
* Fixed PowerPC64 build by adding `-mcpu=power10` and `-mmma` flags (02ca915a3f79ed558edecad390c6adc166ac5d35)
* Introduced support for `f16` destination in `int8` matmul and `int8` inner product on x64 CPUs (a62ed6b88db80bbfa8574a54522fcf11fed534f6, 53c0a667a218ba3dbe9fa7ec1492bf2893f90e78, 07500433f58ec0a25c2a100b7a951f2ade44d162, 4f0f068e02af35ec1e70f7a7113227716f76730f)
* Introduced support for `per_tensor` zero-points in `int8` matmul on Intel GPUs (db8e8ff737016d16fdebb2a931ecbd6aa7a16e3a, f78316439500896a5922d7495867fdc246e2d4ac, 4d458df41ca5e498e8e858ab6b4590e5679db34d, 80453a01bbee0c641c22e72279fca204c036f6c9, 7f90d50536a2dc769348a222cb39f77445d907ac, a2200e2c372d51d493b627e4a99de02991a3279d)
* Fixed correctness issue in `int8` reorder for cases with compensation on x64 CPUs (771ca54f64aa8a43fdf50db33d8739dc407502e8)
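Several entries above concern `int8` matmul with zero-points and floating-point destinations. As a rough illustration of the arithmetic such a primitive performs, here is a minimal plain-Python sketch (this is not the oneDNN API; the function and parameter names are illustrative only):

```python
# Illustrative sketch of an int8 matmul with a per-tensor source zero point
# and a floating-point destination. Roughly: dst = scale * (src - zp) @ weights.
# Names here are made up for illustration; they are not oneDNN identifiers.

def int8_matmul(src, weights, src_zero_point, scale):
    """src, weights: lists of lists of int8 values; returns float results."""
    M, K, N = len(src), len(weights), len(weights[0])
    dst = [[0.0] * N for _ in range(M)]
    for i in range(M):
        for j in range(N):
            acc = 0  # wide integer accumulator, as in a typical int8 GEMM
            for k in range(K):
                # "per_tensor" zero point: a single value shared by the
                # whole source tensor, subtracted before accumulation
                acc += (src[i][k] - src_zero_point) * weights[k][j]
            # dequantize the integer accumulator to a floating-point
            # destination (e.g. f16 in the entries above)
            dst[i][j] = scale * acc
    return dst

out = int8_matmul([[3, 5], [7, 9]], [[1, 2], [3, 4]],
                  src_zero_point=1, scale=0.5)
print(out)  # [[7.0, 10.0], [15.0, 22.0]]
```

A real oneDNN primitive would express the zero point and scales through primitive attributes and run an optimized kernel; the sketch only shows the math being computed.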