| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| Intel(r) Extension for Transformers v1.4.1 Release source code.tar.gz | 2024-04-21 | 103.4 MB | |
| Intel(r) Extension for Transformers v1.4.1 Release source code.zip | 2024-04-21 | 106.3 MB | |
| README.md | 2024-04-21 | 2.2 kB | |
| Totals: 3 Items | | 209.7 MB | 0 |
Highlights
- Support Weight-only Quantization on MTL iGPU
- Upgrade lm-eval to 0.4.2
- Support Llama3
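For readers unfamiliar with the term, the idea behind weight-only quantization can be sketched in a few lines of plain Python. This is only an illustration of group-wise symmetric 4-bit quantization under simple assumptions; it is not the library's implementation, and the function names (`quantize_group`, `quantize_weights`, `dequantize_weights`) and the group size are hypothetical.

```python
from typing import List, Tuple

# Illustrative sketch only: NOT the Intel Extension for Transformers code path.
# All names and defaults here are hypothetical.


def quantize_group(weights: List[float], bits: int = 4) -> Tuple[List[int], float]:
    """Symmetrically quantize one group of weights to signed `bits`-bit integers."""
    qmax = 2 ** (bits - 1) - 1  # 7 for 4-bit
    max_abs = max(abs(w) for w in weights)
    # One scale per group; fall back to 1.0 for an all-zero group.
    scale = max_abs / qmax if max_abs > 0 else 1.0
    quantized = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    return quantized, scale


def quantize_weights(
    weights: List[float], group_size: int = 4, bits: int = 4
) -> List[Tuple[List[int], float]]:
    """Split a flat weight list into groups and quantize each group separately."""
    return [
        quantize_group(weights[start:start + group_size], bits)
        for start in range(0, len(weights), group_size)
    ]


def dequantize_weights(groups: List[Tuple[List[int], float]]) -> List[float]:
    """Rebuild approximate floating-point weights from (integers, scale) pairs."""
    restored: List[float] = []
    for quantized, scale in groups:
        restored.extend(value * scale for value in quantized)
    return restored
```

Only the weights are stored in low precision here; activations stay in floating point, and the weights are dequantized (or consumed by a fused kernel) at compute time. Smaller groups cost more scales but track the local weight range more closely.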
Improvements
- Support TPP for Xeon Tensor Parallel (5f0430f)
- Refine Model `from_pretrained` when `use_neural_speed` (39ecf38e)
Examples
- Add vision front-end demo (1c6550)
- Add a table extraction example and enable the multi-page table handling pipeline (db9e6fb)
- Adapt the textual inversion distillation for quantization example to the latest transformers and diffusers packages (0ec83b1)
- Update NeuralChat Notebooks (83bb65a, 629b9d4)
Bug Fixing
- Fix QBits actshuf buffer overflow under large batch sizes (a6f3ab3)
- Fix TPP support for single socket (a690072)
- Fix retrieval dependency (281b0a3)
- Fix loading issue of WOQ model with parameters (37f9db25)
Validated Configurations
- Python 3.10
- Ubuntu 22.04
- PyTorch 2.2.0+cpu
- Intel® Extension for PyTorch 2.2.0+cpu
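A local environment can be compared against this list with a short standard-library script. The sketch below assumes the pip distribution names `torch` and `intel-extension-for-pytorch`; the helper names are hypothetical, and the operating system is not checked.

```python
import sys
from importlib import metadata
from typing import Dict, List, Optional, Sequence

# Versions copied from the validated configuration above.
VALIDATED_PYTHON = (3, 10)
# Assumption: these are the pip distribution names for the two packages.
VALIDATED_PACKAGES = {
    "torch": "2.2.0+cpu",
    "intel-extension-for-pytorch": "2.2.0+cpu",
}


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def compare_environment(
    python_version: Sequence[int], installed: Dict[str, Optional[str]]
) -> List[str]:
    """Return one message per deviation from the validated configuration."""
    problems: List[str] = []
    if tuple(python_version[:2]) != VALIDATED_PYTHON:
        found = ".".join(str(part) for part in python_version[:2])
        problems.append(f"Python: expected 3.10, found {found}")
    for name, expected in VALIDATED_PACKAGES.items():
        found_version = installed.get(name)
        if found_version != expected:
            shown = found_version or "not installed"
            problems.append(f"{name}: expected {expected}, found {shown}")
    return problems


if __name__ == "__main__":
    current = {name: installed_version(name) for name in VALIDATED_PACKAGES}
    report = compare_environment(sys.version_info, current)
    for line in report or ["Environment matches the validated configuration."]:
        print(line)
```

A mismatch does not necessarily mean the release will fail; it only indicates that the combination differs from the one listed as validated.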