Running a big model on a small laptop
Port of OpenAI's Whisper model in C/C++
DeepSeek 4 Flash local inference engine for Metal
Unified KV Cache Compression Methods for Auto-Regressive Models
Mobile and Web client for Codex and Claude Code, with realtime voice
ByteHook is an Android PLT hook library
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Fast Multimodal LLM on Mobile Devices
Document Management System and Content Management System
Speech recognition application builder and library
Consilium – user-defined sentence suggestion tool
Vision-language-action model for robot control via images and text