Simplifies the local serving of AI models from any source
Official inference framework for 1-bit LLMs
World's fastest and most advanced password recovery utility
Lightweight, standalone C++ inference engine for Google's Gemma models
Style-Bert-VITS2: Bert-VITS2 with more controllable voice styles
High-Resolution Image Synthesis with Latent Diffusion Models
An image/comic viewer application for Windows, Mac, and Linux
AI video generator optimized for low-VRAM and older GPUs
A high-performance, zero-overhead, extensible Python compiler
Open deep learning compiler stack for CPU, GPU, etc.
Python-free Rust inference server
Enables the best performance on NVIDIA RTX Graphics Cards
ArrayFire, a general purpose GPU library
Wan2.1: Open and Advanced Large-Scale Video Generative Model
QVAC Fabric: cross-platform LLM inference and fine-tuning
Driver and tools for controlling Lenovo Legion laptops in Linux
Training neural networks on Apple Neural Engine via APIs
Accelerated libraries for quantum-classical computing built on CUDA-Q
A Python package for extending the official PyTorch
Bailing is a voice dialogue robot similar to GPT-4o
Text and image to video generation: CogVideoX and CogVideo
Khronos Vulkan, OpenGL, and OpenGL ES Conformance Tests
FlashMLA: Efficient Multi-head Latent Attention Kernels
Supercharge Your Model Training
950-line, minimal, extensible LLM inference engine built from scratch