Real-time NVIDIA GPU dashboard
How to optimize an algorithm in CUDA
High-performance library for gradient boosting on decision trees
157 models, 30 providers, one command to find what runs on your hardware
OpenLIT is an open-source LLM observability tool
Run serverless GPU workloads with fast cold starts on bare-metal
Fast and memory-efficient exact attention
High-speed Large Language Model Serving for Local Deployment
Running a big model on a small laptop
A high-performance inference engine for AI models
AI agents running research on single-GPU nanochat training
GPU accelerated decision optimization
Relax! Flux is the ML library that doesn't make you tensor
Python inference and LoRA trainer package for the LTX-2 audio–video model
Performance-optimized AI inference on your GPUs
Open-source Agent Operating System
The Modular Platform (includes MAX & Mojo)
High-performance, multiplayer code editor from the creators of Atom
HeavyDB (formerly MapD/OmniSciDB)
Supercharge Your LLM with the Fastest KV Cache Layer
Pure C inference for the Flux 2 image generation model
UCCL is an efficient communication library for GPUs
Unified KV Cache Compression Methods for Auto-Regressive Models
Large Language Model Text Generation Inference
An opinionated CLI to transcribe audio files with Whisper on-device