A set of utilities for monitoring and customizing GPU performance
Android GPU Inspector
How to optimize some algorithms in CUDA
A simple Minecraft modpack focusing on performance and graphics
OpenLIT is an open-source LLM Observability tool
A Python framework for accelerated simulation, data generation
Fast and memory-efficient exact attention
The CUDA target for Numba
Performance meets Productivity
AI agents running research on single-GPU nanochat training
An open-source, GPU-accelerated physics simulation engine
GPU-accelerated decision optimization
Development repository for the Triton language and compiler
Meridian is an MMM framework
Performance-optimized AI inference on your GPUs
Python inference and LoRA trainer package for the LTX-2 audio–video
Ongoing research training transformer models at scale
Supercharge Your LLM with the Fastest KV Cache Layer
The Modular Platform (includes MAX & Mojo)
Large Language Model Text Generation Inference
Unified KV Cache Compression Methods for Auto-Regressive Models
An opinionated CLI to transcribe audio files with Whisper on-device
Enables the best performance on NVIDIA RTX Graphics Cards
Bridging Reasoning and Action Prediction
Pruna is a model optimization framework built for developers