Intel(r) Extension for Transformers v1.4.1 Release

Name                                                                    Modified     Size
Intel(r) Extension for Transformers v1.4.1 Release source code.tar.gz   2024-04-21   103.4 MB
Intel(r) Extension for Transformers v1.4.1 Release source code.zip      2024-04-21   106.3 MB
README.md                                                               2024-04-21   2.2 kB
Totals: 3 items, 209.7 MB

Highlights Improvements Examples Bug Fixing

Highlights

  • Support weight-only quantization (WOQ) on Meteor Lake (MTL) iGPU
  • Upgrade lm-eval to 0.4.2
  • Support Llama3
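
To make the weight-only quantization highlight concrete, here is a minimal sketch of the general idea in plain Python: weights are split into groups, each group is stored as 4-bit signed integers plus one floating-point scale, and activations are left untouched. This is not the library's kernel or API. The function names, the group size, and the symmetric rounding scheme are assumptions chosen for illustration only.

```python
from typing import List, Tuple

Group = Tuple[List[int], float]  # (quantized integers, scale) for one weight group


def quantize_group(weights: List[float], bits: int = 4) -> Group:
    """Symmetrically quantize one group of weights to signed `bits`-bit integers."""
    qmax = 2 ** (bits - 1) - 1  # 7 for 4-bit
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the representable range.
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale


def quantize_weights(weights: List[float], group_size: int = 4, bits: int = 4) -> List[Group]:
    """Quantize a flat weight list group by group; each group keeps its own scale."""
    return [
        quantize_group(weights[start:start + group_size], bits)
        for start in range(0, len(weights), group_size)
    ]


def dequantize_weights(groups: List[Group]) -> List[float]:
    """Rebuild approximate floating-point weights from integers and scales."""
    restored: List[float] = []
    for q, scale in groups:
        restored.extend(v * scale for v in q)
    return restored


if __name__ == "__main__":
    w = [0.12, -0.50, 0.33, 0.07, 1.40, -0.90, 0.05, 0.60]
    packed = quantize_weights(w, group_size=4)
    restored = dequantize_weights(packed)
    worst = max(abs(a - b) for a, b in zip(w, restored))
    print("quantized groups:", packed)
    print("worst-case reconstruction error:", worst)
```

Per-group scales keep a single large weight from degrading the resolution of every other group, which is why grouped schemes are common for low-bit weights.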

Improvements

  • Support TPP for Xeon Tensor Parallel (5f0430f)
  • Refine model from_pretrained when using use_neural_speed (39ecf38e)
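
For orientation, the sketch below shows roughly how a model might be loaded with 4-bit weight-only quantization through the extension's Transformers-style interface. It follows the usage pattern shown in the project's documentation, but treat the exact keyword arguments as assumptions to verify against the installed release. The `load_woq_model` helper and its default for `use_neural_speed` are hypothetical, and the model name is left as a parameter because the release notes do not name one.

```python
def load_woq_model(model_name: str, use_neural_speed: bool = False):
    """Load a tokenizer and a 4-bit weight-only-quantized causal LM (hypothetical helper).

    Assumes `transformers` and `intel_extension_for_transformers` are installed.
    """
    # Imported lazily so this module can be imported without the packages present.
    from transformers import AutoTokenizer
    from intel_extension_for_transformers.transformers import AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        load_in_4bit=True,                  # request 4-bit weight-only quantization
        use_neural_speed=use_neural_speed,  # assumption: backend toggle named in the notes
    )
    return tokenizer, model
```

Calling `load_woq_model("<model id or local path>")` downloads or reads the checkpoint, so it needs network or disk access and enough memory for the chosen model.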

Examples

  • Add vision front-end demo (1c6550)
  • Add example for table extraction and enable multi-page table handling pipeline (db9e6fb)
  • Adapt textual inversion distillation for quantization example to latest transformers and diffusers packages (0ec83b1)
  • Update NeuralChat notebooks (83bb65a, 629b9d4)

Bug Fixing

  • Fix QBits actshuf buffer overflow with large batch sizes (a6f3ab3)
  • Fix TPP support for single socket (a690072)
  • Fix retrieval dependency (281b0a3)
  • Fix loading issue of WOQ model with parameters (37f9db25)

Validated Configurations

  • Python 3.10
  • Ubuntu 22.04
  • PyTorch 2.2.0+cpu
  • Intel® Extension for PyTorch 2.2.0+cpu
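
A quick way to compare a local environment against the validated configuration is a small standard-library script like the sketch below. The pip distribution names (`torch`, `intel-extension-for-pytorch`) and the `check_environment` helper are assumptions for illustration. A mismatch only means the combination was not the one validated for this release, not that it will fail.

```python
import platform
import sys
from importlib import metadata
from typing import Dict, Optional, Tuple

# Values copied from the validated configurations listed above.
VALIDATED_PYTHON = (3, 10)
VALIDATED_OS_PREFIX = "Ubuntu 22.04"
VALIDATED_PACKAGES = {
    "torch": "2.2.0+cpu",                        # assumed pip distribution name
    "intel-extension-for-pytorch": "2.2.0+cpu",  # assumed pip distribution name
}


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def os_description() -> Optional[str]:
    """Return the OS pretty name on Linux, or None where it is unavailable."""
    try:
        return platform.freedesktop_os_release().get("PRETTY_NAME")
    except (AttributeError, OSError):  # Python < 3.10 or no os-release file
        return None


def check_environment() -> Dict[str, Tuple[bool, Optional[str]]]:
    """Map each checked item to (matches validated value, value found)."""
    report: Dict[str, Tuple[bool, Optional[str]]] = {}
    report["python"] = (sys.version_info[:2] == VALIDATED_PYTHON, platform.python_version())
    desc = os_description()
    report["os"] = (desc is not None and desc.startswith(VALIDATED_OS_PREFIX), desc)
    for name, expected in VALIDATED_PACKAGES.items():
        found = installed_version(name)
        report[name] = (found == expected, found)
    return report


if __name__ == "__main__":
    for item, (ok, found) in check_environment().items():
        print(f"{item:32s} {'match' if ok else 'differs'}  (found: {found})")
```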
Source: README.md, updated 2024-04-21