Tool for exploring and debugging transformer model behaviors
A library for accelerating Transformer models on NVIDIA GPUs
Learn How LLM Transformer Models Work with Interactive Visualization
Implementation of Vision Transformer, a simple way to achieve SOTA
Ongoing research training transformer models at scale
Fast inference engine for Transformer models
Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy
Julia Implementation of Transformer models
A theoretical reconstruction of the Claude Mythos architecture
RF-DETR is a real-time object detection and segmentation model
MoBA: Mixture of Block Attention for Long-Context LLMs
Image generation model with single-stream diffusion transformer
Plugin for IntelliJ IDEA that gives special support for Minecraft mods
Build your chatbot within minutes on your favorite device
BitNet: Scaling 1-bit Transformers for Large Language Models
Trained models & code to predict toxic comments
The most powerful local music generation model
Repo for SeedVR2 & SeedVR
ReFT: Representation Finetuning for Language Models
[NeurIPS2025 Spotlight] Quantized Attention
Fast and memory-efficient exact attention
Fast State-of-the-Art Static Embeddings
A CSS parser, transformer, and minifier written in Rust