Supercharge Your LLM with the Fastest KV Cache Layer
High-performance, multiplayer code editor from the creators of Atom
Pruna is a model optimization framework built for developers
Official mirror of libplacebo
Enables the best performance on NVIDIA RTX graphics cards
UCCL is an efficient communication library for GPUs
Unified KV Cache Compression Methods for Auto-Regressive Models
Pure C inference for the Flux 2 image generation model
High-performance CPU, GPU, and memory profiler for Python
Large Language Model Text Generation Inference
A high-performance anime upscaler
Boosted trees in Julia
Vulkan-based implementation of D3D9, D3D10 and D3D11 for Linux / Wine
Monitor temperature sensors, fan speed, voltage, load & clock speeds
Lightweight, high-performance HTML renderer for game developers
A set of AI-enabled effects, generators, and analyzers for Audacity
Connect to remote ffmpeg servers
Faster Whisper transcription with CTranslate2
Bridging Reasoning and Action Prediction
A modern cross-platform low-level graphics API
Alibaba's high-performance LLM inference engine for diverse apps
Easily compute CLIP embeddings and build a CLIP retrieval system
GPU-accelerated GUI development for Node.js and the browser
Fast Differentiable Tensor Library in JavaScript & TypeScript with Bun
Lightweight Armoury Crate alternative for Asus laptops and ROG Ally