FasterTransformer is a high-performance inference library from NVIDIA that accelerates transformer-based models such as BERT, GPT, and T5 on NVIDIA GPUs. It provides optimized transformer encoder and decoder layers built on CUDA, cuBLAS, and custom kernels to raise throughput and lower latency. The library integrates with the TensorFlow and PyTorch frameworks and can be served through a Triton Inference Server backend, so developers can add it to existing pipelines without major changes. Its optimizations include reduced-precision execution (FP16 and INT8), tensor parallelism, and efficient memory management, which let large models run across multiple GPUs and nodes. FasterTransformer is focused on inference workloads, where it is designed to outperform the standard framework implementations. Although development has moved to TensorRT-LLM, the project remains a useful reference for understanding optimized transformer execution.

Features

  • Optimized transformer encoder and decoder implementations
  • Support for BERT, GPT, T5, and related architectures
  • Multi-GPU and multi-node inference with parallelism
  • Reduced-precision inference with FP16 and INT8
  • Integration with TensorFlow, PyTorch, and Triton Inference Server
  • High-performance CUDA and cuBLAS-based kernels
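
The multi-GPU parallelism listed above is easiest to picture on a single linear layer. The following is a minimal conceptual sketch, not FasterTransformer code and not its API: it assumes a column-wise split of the weight matrix and simulates the "devices" as plain Python lists. The function names (`matmul`, `split_columns`, `column_parallel_linear`) are hypothetical and chosen only for this illustration.

```python
# Conceptual sketch of column-wise tensor parallelism for one linear layer.
# Assumption: this simulates GPUs with Python lists; it is not the library's API.


def matmul(x, w):
    """Multiply x (m x k) by w (k x n); both are lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)] for row in x]


def split_columns(w, num_shards):
    """Split w (k x n) into num_shards column blocks of equal width."""
    n = len(w[0])
    if n % num_shards != 0:
        raise ValueError("column count must be divisible by num_shards")
    width = n // num_shards
    return [[row[i * width:(i + 1) * width] for row in w] for i in range(num_shards)]


def column_parallel_linear(x, w, num_shards):
    """Compute x @ w by giving each simulated device one column block of w."""
    shards = split_columns(w, num_shards)
    # Each partial product would run on its own GPU in a real deployment.
    partials = [matmul(x, shard) for shard in shards]
    # Stand-in for an all-gather: concatenate the partial outputs column-wise.
    return [sum((p[r] for p in partials), []) for r in range(len(x))]


x = [[1, 2], [3, 4]]
w = [[1, 0, 2, 1], [0, 1, 1, 3]]
result = column_parallel_linear(x, w, num_shards=2)
print(result == matmul(x, w))  # the sharded result matches the unsharded one
```

Because every shard holds only a slice of the weights, per-device memory shrinks roughly in proportion to the number of shards, at the cost of a communication step to reassemble the output.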

License

Apache License V2.0

Additional Project Details

Programming Language

C++

Related Categories

C++ Artificial Intelligence Software

Registered

2026-03-18