Run Local LLMs on Any Device. Open-source
NVR with realtime local object detection for IP cameras
Universal LLM Deployment Engine with ML Compilation
157 models, 30 providers, one command to find what runs on hardware
Machine learning on FPGAs using HLS
Fast LLM speculative inference server for consumer hardware
AirLLM 70B inference with a single 4GB GPU
Official repository for LTX-Video
Realtime AI Voice Agents with SoTA Multimodal AI models on Arduino ESP
TT-NN operator library, and TT-Metalium low-level kernel programming
Fast Multimodal LLM on Mobile Devices
TTS with Kokoro and ONNX Runtime
Official inference repo for FLUX.1 models
Any model. Any hardware. Zero compromise
Parallax is a distributed model serving framework
Official inference framework for 1-bit LLMs
AI video generator optimized for low VRAM and older GPUs
Fast ML inference & training for ONNX models in Rust
High-performance Inference and Deployment Toolkit for LLMs and VLMs
Phi-3.5 for Mac: Locally-run Vision and Language Models
Tensor library for machine learning
The RF and reverse engineering framework for everyone
Run OpenClaw on a $5 chip
Clippy, now with some AI
Please do not feed the models