Intel Extension for Transformers is a toolkit for accelerating Transformer-based models on Intel platforms, including CPUs and GPUs. It provides state-of-the-art compression techniques for Large Language Models (LLMs) and tools for building chatbots on a variety of devices within minutes, with the goal of making Transformer-based models more efficient and accessible.

Features

  • Acceleration of Transformer-based models
  • Optimization for Intel CPUs and GPUs
  • State-of-the-art compression techniques for LLMs
  • Rapid chatbot development tools
  • Support for various devices
  • Enhanced performance and efficiency
  • Integration with existing AI workflows
  • Open-source toolkit
  • Comprehensive documentation and support
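The LLM compression mentioned above typically centers on weight quantization, which stores model weights in low-precision integers to cut memory use and speed up inference. As a minimal pure-Python sketch (illustrative only; the toolkit's actual implementation is far more sophisticated, with per-group scales and INT4 support), symmetric per-tensor INT8 quantization works like this:

```python
def quantize_int8(weights):
    """Symmetric per-tensor INT8 quantization: map floats into [-127, 127]."""
    # One scale for the whole tensor, chosen so the largest weight maps to 127.
    scale = max(abs(w) for w in weights) / 127.0 or 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the INT8 codes."""
    return [v * scale for v in q]

weights = [0.12, -0.50, 0.33, 0.01]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
# Each restored weight differs from the original by at most scale / 2.
```

Real weight-only quantization schemes refine this idea with per-channel or per-group scales to keep the rounding error small on large, unevenly distributed weight matrices.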

Categories

LLM Inference

License

Apache License V2.0

Additional Project Details

Operating Systems

Mac

Programming Language

Python

Related Categories

Python LLM Inference Tool

Registered

2025-03-18