
We launched Intel® Extension for PyTorch* in 2020 with the goal of extending the official PyTorch* to simplify achieving high performance on Intel® CPU and GPU platforms. Over the years, we have successfully upstreamed most of our features and optimizations for Intel® platforms into PyTorch*. Moving forward, our strategy is to focus on developing new features and supporting upcoming platform launches directly within PyTorch*. We are discontinuing active development on Intel® Extension for PyTorch*, effective immediately after 2.8 release. We will continue to provide critical bug fixes and security patches throughout the PyTorch* 2.9 timeframe to ensure a smooth transition for our partners and the community.

2.8.10+xpu

Intel® Extension for PyTorch* v2.8.10+xpu is a new release, based on PyTorch* 2.8.0, that supports Intel® GPU platforms (Intel® Arc™ Graphics family, Intel® Core™ Ultra Processors with Intel® Arc™ Graphics, Intel® Core™ Ultra Series 2 with Intel® Arc™ Graphics, Intel® Core™ Ultra Series 2 Mobile Processors, and Intel® Data Center GPU Max Series).

Highlights

  • Intel® oneDNN v3.8.1 integration
  • Intel® Deep Learning Essentials 2025.1.3 compatibility
  • Large Language Model (LLM) optimization

Intel® Extension for PyTorch* optimizes the performance of Qwen3, along with other typical LLM models, on Intel® GPU platforms, with the supported Transformers version upgraded to 4.51.3. A full list of optimized LLM models is available in the LLM Optimizations Overview. Intel® Extension for PyTorch* also adds support for more custom kernels, such as selective_scan_fn, causal_conv1d_fn, and causal_conv1d_update, to support the Jamba model.
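A minimal sketch of enabling these optimizations through ipex.llm.optimize is shown below; the model id Qwen/Qwen3-8B, dtype, and generation settings are illustrative rather than an official recipe, so consult the LLM Optimizations Overview for validated configurations:

```
import torch
import intel_extension_for_pytorch as ipex
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen3-8B"  # illustrative; any model in the optimized list works similarly
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)
model = model.eval().to("xpu")

# Apply the LLM-specific optimizations for Intel GPU.
model = ipex.llm.optimize(model, dtype=torch.float16, device="xpu")

prompt = "What does Intel Extension for PyTorch do?"
inputs = tokenizer(prompt, return_tensors="pt").to("xpu")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```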

  • PyTorch* XCCL adoption for distributed scenarios

Intel® Extension for PyTorch* adopts the PyTorch* XCCL backend for distributed scenarios on Intel® GPU platforms. We observed that the scaling performance using PyTorch* XCCL is on par with that of OneCCL Bindings for PyTorch* (torch-ccl) for validated AI workloads. As a result, we will discontinue active development of torch-ccl immediately after the 2.8 release.

A code example illustrating the transition from torch-ccl to PyTorch* XCCL at the model script level is shown below:

```
import torch

if torch.distributed.is_xccl_available():
    # Use the XCCL backend shipped natively with PyTorch* 2.8.
    torch.distributed.init_process_group(backend='xccl')
else:
    # Fall back to OneCCL Bindings for PyTorch* (torch-ccl).
    import oneccl_bindings_for_pytorch
    torch.distributed.init_process_group(backend='ccl')
```
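Either path can then be launched with the standard PyTorch* tooling, for example `torchrun --nproc_per_node=<num_ranks> script.py` (the script name is a placeholder); no torch-ccl-specific launcher is needed on the XCCL path.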
  • Redundant code removal

Intel® Extension for PyTorch* no longer overrides the device allocator; the allocator provided by PyTorch* should be used instead. Intel® Extension for PyTorch* also removes all of its overridden oneMKL- and oneDNN-related operators except GEMM and SDPA.
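On the allocator side, a minimal sketch of the recommended usage, assuming a machine with a working XPU device, relies only on the stock torch.xpu memory APIs:

```
import torch

# Tensors on an XPU device now go through PyTorch*'s native caching allocator.
x = torch.empty(1024, 1024, device='xpu')

# Allocator introspection uses the stock torch.xpu APIs.
print(torch.xpu.memory_allocated())  # bytes currently occupied by tensors
torch.xpu.empty_cache()              # return cached, unused blocks to the driver
```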

Known Issues

Please refer to the Known Issues webpage.

Source: README.md, updated 2025-08-06