Port of Facebook's LLaMA model in C/C++
Run models such as Kimi-K2.5, GLM-5, DeepSeek, gpt-oss, Gemma, Qwen, and more
Low-latency AI inference engine optimized for mobile devices
AI video generator optimized for low VRAM and older GPUs
Pure C inference for the Flux 2 image generation model
Run llama and other large language models offline on iOS and macOS
Locally run an Instruction-Tuned Chat-Style LLM
AI macOS app for real-time coding-interview coaching
mujoco-py allows using MuJoCo from Python 3
Multi-agent road-traffic simulator built with Qt/C++ and OpenStreetMap