Alibaba's high-performance LLM inference engine for diverse applications
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI
Fast Multimodal LLM on Mobile Devices
A @ClickHouse fork that supports high-performance vector search
Emscripten: An LLVM-to-WebAssembly Compiler
High-speed Large Language Model Serving for Local Deployment