We are excited to announce the release of Intel® Extension for PyTorch* v2.6.10+xpu. This release, based on PyTorch* 2.6.0, supports Intel® GPU platforms, including Intel® Data Center GPU Max Series, Intel® Arc™ Graphics family, Intel® Core™ Ultra Processors with Intel® Arc™ Graphics, Intel® Core™ Ultra Series 2 with Intel® Arc™ Graphics, Intel® Core™ Ultra Series 2 Mobile Processors, and Intel® Data Center GPU Flex Series.
## Highlights
- Intel® oneDNN v3.7 integration
- Intel® oneAPI Base Toolkit 2025.0.1 compatibility
- Official PyTorch 2.6 prebuilt binaries support
Starting with this release, Intel® Extension for PyTorch* supports the official PyTorch prebuilt binaries: since PyTorch* 2.6 they are built with `_GLIBCXX_USE_CXX11_ABI=1`, and are therefore ABI compatible with the Intel® Extension for PyTorch* prebuilt binaries, which have always been built with `_GLIBCXX_USE_CXX11_ABI=1`.
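A quick way to confirm the ABI match from Python (a minimal sketch, assuming both wheels are installed in the same environment):

```python
import torch
import intel_extension_for_pytorch as ipex

# PyTorch >= 2.6 prebuilt wheels are built with _GLIBCXX_USE_CXX11_ABI=1,
# matching the ABI the extension's prebuilt wheels have always used.
print(torch.compiled_with_cxx11_abi())  # expected: True
print(torch.__version__, ipex.__version__)
```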
- Large Language Model (LLM) optimization
Intel® Extension for PyTorch* provides a variety of custom kernels, including commonly used kernel fusions such as `rms_norm` and `rotary_embedding`, attention-related kernels such as `paged_attention` and `chunked_prefill`, and the `punica` kernel for serving multiple LoRA-finetuned LLMs. It also provides MoE (Mixture of Experts) custom kernels, including `topk_softmax`, `moe_gemm`, `moe_scatter`, `moe_gather`, etc. These optimizations improve the execution of key operations, enhancing the functionality and efficiency of the ecosystem on Intel® GPU platforms.
Besides that, Intel® Extension for PyTorch* optimizes more LLM models for inference and finetuning, such as Phi3-vision-128k, Phi3-small-128k, Llama3.2-11B-vision, etc. A full list of optimized models can be found in the LLM Optimizations Overview.
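As an illustration of how these optimizations are typically applied, the sketch below uses the `ipex.llm.optimize` frontend; the model id is a placeholder, and the exact set of fused kernels applied depends on the model architecture:

```python
import torch
import intel_extension_for_pytorch as ipex
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-llm-model-id"  # placeholder: pick a model from the optimized list
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16
).eval().to("xpu")

# Applies the fused kernels (rms_norm, rotary_embedding, paged_attention, ...)
# where the architecture is supported; unsupported models fall back unchanged.
model = ipex.llm.optimize(model, dtype=torch.float16, device="xpu", inplace=True)
```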
- Serving framework support
Intel® Extension for PyTorch* offers extensive support for serving ecosystems, including vLLM and TGI, with the goal of enhancing performance and flexibility for LLM workloads on Intel® GPU platforms (intensively verified on Intel® Data Center GPU Max Series and Intel® Arc™ B-Series graphics on Linux). vLLM/TGI features such as chunked prefill and MoE (Mixture of Experts) are backed by the kernels provided in Intel® Extension for PyTorch*. Support for low precision, such as Weight Only Quantization (WOQ) INT4, is also enhanced in this release:
  - The INT4 GEMM kernel based on the Generalized Post-Training Quantization (GPTQ) algorithm is approximately 1.3× faster than in the previous release. During the prefill stage it achieves performance similar to FP16, while in the decode stage it outperforms FP16 by approximately 1.5×.
  - Support for the Activation-aware Weight Quantization (AWQ) algorithm is added, with performance on par with GPTQ without `g_idx`.
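For example, serving a GPTQ INT4 checkpoint through vLLM might look like the following (a sketch assuming a vLLM build with the XPU backend; the model id is a placeholder):

```python
from vllm import LLM, SamplingParams

# The GPTQ INT4 weights are handled by the WOQ GEMM kernels described above;
# chunked prefill is one of the vLLM features backed by the extension's kernels.
llm = LLM(
    model="your-gptq-int4-model-id",  # placeholder for an INT4 GPTQ checkpoint
    quantization="gptq",
    dtype="float16",
    enable_chunked_prefill=True,
)

outputs = llm.generate(
    ["What is weight-only quantization?"],
    SamplingParams(max_tokens=64),
)
print(outputs[0].outputs[0].text)
```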
- [Prototype] NF4 QLoRA finetuning using BitsAndBytes
Intel® Extension for PyTorch* now supports QLoRA finetuning with BitsAndBytes on Intel® GPU platforms. It enables efficient adaptation of LLMs using NF4 4-bit quantization with LoRA, reducing memory usage while maintaining accuracy.
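A minimal NF4 QLoRA setup with BitsAndBytes and PEFT might look like this (a sketch; the model id, LoRA hyperparameters, and the `device_map="xpu"` mapping are assumptions for illustration):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# NF4 4-bit quantization config; the compute dtype is an assumption here.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "your-llm-model-id",          # placeholder
    quantization_config=bnb_config,
    device_map="xpu",             # assumed XPU device mapping
)

# Attach LoRA adapters; only these low-rank matrices are trained.
lora = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"],
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()
```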
- [Beta] Intel® Core™ Ultra Series 2 Mobile Processors support on Windows
Intel® Extension for PyTorch* provides beta-quality support for Intel® Core™ Ultra Series 2 Mobile Processors (codename Arrow Lake-H) on Windows in this release, based on redistributed PyTorch 2.6 prebuilt binaries with an additional AOT compilation target for Arrow Lake-H in the download server.
- Hybrid ATen operator implementation
Intel® Extension for PyTorch* uses the ATen operators available in Torch XPU Operators as much as possible, and overrides only a limited set of operators for better performance and broader data type support.
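In practice this is transparent to user code: a standard ATen call on an `xpu` tensor dispatches to the upstream Torch XPU implementation or to the extension's override, whichever is registered (a minimal sketch):

```python
import torch
import intel_extension_for_pytorch as ipex  # registers the extension's overrides

# Plain ATen operators; dispatch picks the appropriate XPU kernel for each op.
x = torch.randn(1024, 1024, device="xpu", dtype=torch.float16)
y = torch.nn.functional.gelu(x @ x.T)
print(y.device, y.dtype)
```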
## Breaking Changes
- Intel® Data Center GPU Flex Series support is being deprecated and will no longer be available starting from the release after v2.6.10+xpu.
- Channels Last 1D support on XPU is being deprecated and will no longer be available starting from the release after v2.6.10+xpu.
## Known Issues
Please refer to the Known Issues webpage.