Image generation model with single-stream diffusion transformer
Real-time NVIDIA GPU dashboard
Solve puzzles. Learn CUDA
AirLLM: 70B inference with a single 4GB GPU
Running large language models on a single GPU
157 models, 30 providers, one command to find what runs on hardware
How to optimize algorithms in CUDA
HeavyDB (formerly MapD/OmniSciDB)
Fast-stable-diffusion + DreamBooth
Unified KV Cache Compression Methods for Auto-Regressive Models
From Vibe Coding to Agentic Engineering
Terminal-native coding agent powered by local LLMs
Python inference and LoRA trainer package for the LTX-2 audio–video
ChatGLM-6B: An Open Bilingual Dialogue Language Model
Structure-from-Motion and Multi-View Stereo
UCCL is an efficient communication library for GPUs
A high-quality rapid TTS voice cloning model
A TTS that fits in your CPU (and pocket)
Open-source deep-learning framework for building and training
A Python tool that uses GPT-4, FFmpeg, and OpenCV
Run Local LLMs on Any Device. Open-source
Fast LLM speculative inference server for consumer hardware
A straightforward method for training your LLM
Running a big model on a small laptop
Please do not feed the models