Open-source, high-performance AI model with advanced reasoning
A high-throughput and memory-efficient inference and serving engine
A @ClickHouse fork that supports high-performance vector search
A high-performance inference engine for AI models
TokenSpeed is a speed-of-light LLM inference engine
Diversity-driven optimization and large-model reasoning ability
A Simple and Universal Swarm Intelligence Engine
Open-source large language model family from Tencent Hunyuan
High-performance inference framework for large language models
Advanced LLM-powered brute-force tool combining AI intelligence
Fast, local-first web content extraction for LLMs
Alibaba's high-performance LLM inference engine for diverse apps
A simple, performant and scalable JAX LLM
High-performance Inference and Deployment Toolkit for LLMs and VLMs
High-speed Large Language Model Serving for Local Deployment
slime is an LLM post-training framework for RL scaling
Mooncake is the serving platform for Kimi
Advanced language and coding AI model
Universal LLM Deployment Engine with ML Compilation
Powerful AI language model (MoE) optimized for efficiency/performance
WebAssembly binding for llama.cpp - Enabling on-browser LLM inference
GLM-4.5: Open-source LLM for intelligent agents by Z.ai
Kimi K2 is the large language model series developed by Moonshot AI
SDG is a specialized framework
AirLLM: 70B inference with a single 4GB GPU