Solve puzzles. Learn CUDA
Performance-optimized AI inference on your GPUs
Running large language models on a single GPU
AirLLM 70B inference with single 4GB GPU
How to optimize algorithms in CUDA
Parallax is a distributed model serving framework
Run Local LLMs on Any Device. Open-source
SkyPilot: Run AI and batch jobs on any infra
AI agents running research on single-GPU nanochat training
Reinforcement Learning for Humanoid Robot with Zero-Shot Sim2Real
A high-quality rapid TTS voice cloning model
A voice cloning tool with a web interface, using your own voice
State-of-the-art Parameter-Efficient Fine-Tuning
Voice recognition (speech-to-text) tool
An opinionated CLI to transcribe Audio files w/ Whisper on-device
Making large AI models cheaper, faster and more accessible
ChatGLM-6B: An Open Bilingual Dialogue Language Model
Simplifies the local serving of AI models from any source
High-Resolution Image Synthesis with Latent Diffusion Models
Official inference framework for 1-bit LLMs
YOLOv5 is the world's most loved vision AI
AI video generator optimized for low VRAM and older GPUs
Style-Bert-VITS2: Bert-VITS2 with more controllable voice styles
Open deep learning compiler stack for CPU, GPU, etc.
Wan2.1: Open and Advanced Large-Scale Video Generative Model